{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/peremartra/Large-Language-Model-Notebooks-Course/blob/main/6-PRUNING/6_3b_data-driven_width_pruning_llama3.2-1b.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DT0FAsUnncFY"
      },
      "source": [
        "<div>\n",
        "    <h1>Large Language Models Projects</a></h1>\n",
        "    <h3>Apply and Implement Strategies for Large Language Models</h3>\n",
        "    <h2>Data-Driven Width Pruning Llama 3.2.</h2>\n",
        "    <h3>How to decide with neurons to remove from GLU Structure.</h3>\n",
        "</div>\n",
        "\n",
        "by [Pere Martra](https://www.linkedin.com/in/pere-martra/)\n",
        "\n",
        "_______\n",
        "Models: meta-llama/Llama-3.2-1B\n",
        "\n",
        "Colab Environment: GPU T4.\n",
        "\n",
        "Keys:\n",
        "* Pruning\n",
        "* Structured pruning\n",
        "\n",
        "\n",
        "Related article: [How to Prune LLaMA 3.2 and Similar Large Language Models](ttps://medium.com/towards-data-science/how-to-prune-llama-3-2-and-similar-large-language-models-cf18e9a2afb6.)\n",
        "_______\n",
        "**disclaimer: The pruning section was created after the first edition of the book was published. They are not included in the book’s original content but are intended to supplement and expand on the topics covered.**\n",
        "\n",
        "This is the unofficial repository for the book:\n",
        "        <a href=\"https://amzn.to/4eanT1g\"> <b>Large Language Models:</b> Apply and Implement Strategies for Large Language Models</a> (Apress).\n",
        "        The book is based on the content of this repository, but the notebooks are being updated, and I am incorporating new examples and chapters.\n",
        "        If you are looking for the official repository for the book, with the original notebooks, you should visit the\n",
        "        <a href=\"https://github.com/Apress/Large-Language-Models-Projects\">Apress repository</a>, where you can find all the notebooks in their original format as they appear in the book.\n",
        "______"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "iEwxZCVsoIau"
      },
      "source": [
        "# Introduction\n",
        "This notebook cotinues the work done at: [6_2_pruning_structured_llama3.2-1b_KO.ipynb](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/6-PRUNING/6_3_pruning_structured_llama3.2-1b_OK.ipynb) where statics methods has beeen used to select the neurons to prune.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "eQIxAOPZtPBN"
      },
      "source": [
        "#Install libraries & Configure variables."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 7,
      "metadata": {
        "id": "5zHApVm41HWq"
      },
      "outputs": [],
      "source": [
        "!pip install -q transformers\n",
        "!pip install -q torch\n",
        "!pip install -q datasets\n",
        "#!pip install -q sentencepiece"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 8,
      "metadata": {
        "id": "GJNgRj4M187E"
      },
      "outputs": [],
      "source": [
        "import torch\n",
        "from datasets import load_dataset\n",
        "from transformers import AutoModelForCausalLM, AutoTokenizer\n",
        "from torch import nn\n",
        "from torch.utils.data import DataLoader\n",
        "import os, gc\n",
        "from tqdm import tqdm"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 9,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "tbIyUlXEtbqs",
        "outputId": "9edb5ddc-35b2-4adb-b73a-474738c04171"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Using device: cuda\n"
          ]
        }
      ],
      "source": [
        "# Check if GPU is available\n",
        "device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n",
        "print(f\"Using device: {device}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "sM-QwxyKw-YG"
      },
      "source": [
        "#Download model and explore structure"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "q-z_1Zpg2I6u"
      },
      "outputs": [],
      "source": [
        "model_name = 'meta-llama/Llama-3.2-1B'\n",
        "model = AutoModelForCausalLM.from_pretrained(model_name, dtype=torch.float16).to(device)\n",
        "tokenizer = AutoTokenizer.from_pretrained(model_name)\n",
        "tokenizer.pad_token = tokenizer.eos_token  # Set pad token"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 11,
      "metadata": {
        "id": "9UpMD4Hw2MWg"
      },
      "outputs": [],
      "source": [
        "def get_output(prompt, model=model, tokenizer=tokenizer):\n",
        "    inputs = tokenizer(prompt, return_tensors='pt').to(device)\n",
        "    outputs = model.generate(\n",
        "        inputs['input_ids'],\n",
        "        attention_mask=inputs['attention_mask'],\n",
        "        max_length=50,\n",
        "        num_return_sequences=1,\n",
        "        pad_token_id=tokenizer.pad_token_id,\n",
        "        temperature=None,\n",
        "        top_p=None,\n",
        "        do_sample=False,          # Disable sampling\n",
        "        num_beams=5,              # Use beam search\n",
        "        early_stopping=True,      # Stop when end-of-sequence token is generated\n",
        "        no_repeat_ngram_size=2    # Prevent repetition of 2-grams\n",
        "    )\n",
        "    generated = tokenizer.decode(outputs[0], skip_special_tokens=True)\n",
        "    return generated"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "4muyx_8M5OAu"
      },
      "source": [
        "## studying the model structure\n",
        "As demonstrated in the [previous notebook](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/6-PRUNING/6_2_pruning_structured_llama3.2-1b_KO.ipynb), studying the structure of the model that will undergo pruning is crucial.\n",
        "\n",
        "In this notebook we improve the width pruning process that could be seen in the Notebook: [Pruning Llama 3.2.](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/6-PRUNING/6_3_pruning_structured_llama3.2-1b_OK.ipynb). Where a pruning of the model's GLU structure was performed taking into account its static weights.\n",
        "\n",
        "Here we expand the criterion also using the activations of one of the cpas of the GLU, so the pruning that is performed is considered data-drive."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 12,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "y5Hs4oQ4B7Z0",
        "outputId": "3608047b-d19f-4c48-fb05-30ea0ee3f925"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "LlamaForCausalLM(\n",
            "  (model): LlamaModel(\n",
            "    (embed_tokens): Embedding(128256, 2048)\n",
            "    (layers): ModuleList(\n",
            "      (0-15): 16 x LlamaDecoderLayer(\n",
            "        (self_attn): LlamaAttention(\n",
            "          (q_proj): Linear(in_features=2048, out_features=2048, bias=False)\n",
            "          (k_proj): Linear(in_features=2048, out_features=512, bias=False)\n",
            "          (v_proj): Linear(in_features=2048, out_features=512, bias=False)\n",
            "          (o_proj): Linear(in_features=2048, out_features=2048, bias=False)\n",
            "        )\n",
            "        (mlp): LlamaMLP(\n",
            "          (gate_proj): Linear(in_features=2048, out_features=8192, bias=False)\n",
            "          (up_proj): Linear(in_features=2048, out_features=8192, bias=False)\n",
            "          (down_proj): Linear(in_features=8192, out_features=2048, bias=False)\n",
            "          (act_fn): SiLUActivation()\n",
            "        )\n",
            "        (input_layernorm): LlamaRMSNorm((2048,), eps=1e-05)\n",
            "        (post_attention_layernorm): LlamaRMSNorm((2048,), eps=1e-05)\n",
            "      )\n",
            "    )\n",
            "    (norm): LlamaRMSNorm((2048,), eps=1e-05)\n",
            "    (rotary_emb): LlamaRotaryEmbedding()\n",
            "  )\n",
            "  (lm_head): Linear(in_features=2048, out_features=128256, bias=False)\n",
            ")\n"
          ]
        }
      ],
      "source": [
        "print(model)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "yPMslK3QCAb1"
      },
      "source": [
        "\n",
        "An MLP block typically consists of layers that scale the data to larger dimensions and others that return it to its original size.\n",
        "\n",
        "In the MLP block of the model, we find two projection layers: `gate_proj` and `up_proj`, both scaling from 2048 to 8192. The purpose of having two layers projecting to the same intermediate size might be related to gating mechanisms. A gating mechanism selectively controls information flow in neural networks by using learned weights to \"gate\" or filter inputs.\n",
        "\n",
        "Also highlight the role the `down_proj` layer plays contracting the information back, after passing through an activation function, to its original size.\n",
        "\n",
        "However, to truly understand how these layers function, we’d need to refer to the model's documentation or even the source code. But, this structure usually indicates, at least, I haven't encountered a case where it doesn't, that the layers performing the upsizing work in pairs, and they cannot be treated as independent linear layers.\n",
        "\n",
        "In other words, any operation we apply to one layer must be replicated in the other. Most importantly, when identifying which neurons have more or less importance, we can't evaluate the neurons of a single layer in isolation; we need to treat them as pairs.\n",
        "\n",
        "For our data-driven pruning process the evaluation we will do will be:\n",
        "* `gate_proj`: Static with the magnitude of its weights.\n",
        "* `up_proj`: Static with the magnitude of its weights.\n",
        "* `down_proj`: Dynamic with activations + Static with the magnitude of its weights."
      ]
    },
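    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To make the pairing concrete, here is a minimal, self-contained sketch (toy dimensions and hypothetical variable names, not part of the original notebook) of what removing a neuron pair means: dropping a row from `gate_proj` and `up_proj`, and the matching column from `down_proj`:\n",
        "\n",
        "```python\n",
        "import torch\n",
        "from torch import nn\n",
        "\n",
        "# Toy GLU block: hidden_size=4, intermediate_size=6\n",
        "gate = nn.Linear(4, 6, bias=False)\n",
        "up = nn.Linear(4, 6, bias=False)\n",
        "down = nn.Linear(6, 4, bias=False)\n",
        "\n",
        "# Suppose we keep neuron pairs 0, 2, 3 and 5 (drop 1 and 4)\n",
        "keep = torch.tensor([0, 2, 3, 5])\n",
        "\n",
        "# gate/up lose output rows; down loses the matching input columns\n",
        "pruned_gate = nn.Linear(4, len(keep), bias=False)\n",
        "pruned_gate.weight.data = gate.weight.data[keep, :]\n",
        "pruned_up = nn.Linear(4, len(keep), bias=False)\n",
        "pruned_up.weight.data = up.weight.data[keep, :]\n",
        "pruned_down = nn.Linear(len(keep), 4, bias=False)\n",
        "pruned_down.weight.data = down.weight.data[:, keep]\n",
        "\n",
        "# The pruned block still maps hidden_size -> hidden_size\n",
        "x = torch.randn(1, 4)\n",
        "out = pruned_down(nn.functional.silu(pruned_gate(x)) * pruned_up(x))\n",
        "print(out.shape)  # torch.Size([1, 4])\n",
        "```"
      ]
    },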
    {
      "cell_type": "code",
      "execution_count": 13,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "alKH3QH64WFL",
        "outputId": "98ca7a50-dd67-4332-96aa-d1cbe82cd7dd"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Generated text: Paris is the capital of France and one of the most visited cities in the world. It is a city with a rich history and culture, as well as a vibrant and diverse population. Paris is home to many famous landmarks, including the Eiff\n"
          ]
        }
      ],
      "source": [
        "# Test the original model\n",
        "prompt = \"Paris is the capital of\"\n",
        "generated = get_output(prompt)\n",
        "print(f\"Generated text: {generated}\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 14,
      "metadata": {
        "id": "8WR96iwq2XYH"
      },
      "outputs": [],
      "source": [
        "def count_parameters(model):\n",
        "    return sum(p.numel() for p in model.parameters())\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 15,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "Kph43oObnet7",
        "outputId": "b4cc5229-11c0-4dd5-c26a-30e5ff11c0cb"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Original model parameters: 1235814400\n"
          ]
        }
      ],
      "source": [
        "original_param_count = count_parameters(model)\n",
        "print(f\"Original model parameters: {original_param_count}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Load Dataset"
      ],
      "metadata": {
        "id": "_YjJAxQsVN1C"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "RECOVERY_SAMPLES = 1000\n",
        "BATCH_SIZE = 8\n",
        "MAX_LENGTH = 1024"
      ],
      "metadata": {
        "id": "z5vueVYjVNQn"
      },
      "execution_count": 16,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "We're going to use a generic dataset like wikitext, but data-driven pruning shines when the dataset to use is specific to our model. That is, when we want to specialize it in a specific domain."
      ],
      "metadata": {
        "id": "T32qJOjnjCiZ"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "datawiki = load_dataset('wikitext', 'wikitext-2-raw-v1', split=f'train[:{RECOVERY_SAMPLES}]')\n"
      ],
      "metadata": {
        "id": "LaSSECayVSqi"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "def prepare_dataset(dataset, text_field='text'):\n",
        "  def tokenize_function(examples):\n",
        "      if text_field in examples:\n",
        "          texts = examples[text_field]\n",
        "      elif 'sms' in examples:  # SMS dataset\n",
        "          texts = examples['sms']\n",
        "      elif 'text' in examples:\n",
        "          texts = examples['text']\n",
        "      else:\n",
        "          texts = examples[list(examples.keys())[0]]  # First available field\n",
        "\n",
        "      return tokenizer(\n",
        "          texts,\n",
        "          truncation=True,\n",
        "          padding='max_length',\n",
        "          max_length=MAX_LENGTH,\n",
        "          return_tensors='pt'\n",
        "      )\n",
        "\n",
        "  tokenized = dataset.map(tokenize_function, batched=True, remove_columns=dataset.column_names)\n",
        "  tokenized.set_format(type='torch', columns=['input_ids', 'attention_mask'])\n",
        "  return DataLoader(tokenized, batch_size=BATCH_SIZE, shuffle=False)"
      ],
      "metadata": {
        "id": "ZJzluaCkVZQZ"
      },
      "execution_count": 18,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Create dataloader\n",
        "dataloaderwiki = prepare_dataset(datawiki)"
      ],
      "metadata": {
        "id": "kmtBfvyjVb2p"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Hooks\n",
        "\n",
        "To be able to analyze the activations that occur in a layer, we need a mechanism that lets us \"spy\" on what happens inside the model.\n",
        "\n",
        "This mechanism is PyTorch Hooks, which let us hook a function to any module of the model and its input and output activations."
      ],
      "metadata": {
        "id": "KGTilesCi4rS"
      }
    },
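    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Before the full implementation below, here is a minimal, self-contained sketch (toy module, not part of the original notebook) of how a forward hook captures a module's input:\n",
        "\n",
        "```python\n",
        "import torch\n",
        "from torch import nn\n",
        "\n",
        "layer = nn.Linear(8, 8)\n",
        "captured = {}\n",
        "\n",
        "def hook(module, inputs, output):\n",
        "    # inputs is a tuple of the module's positional inputs\n",
        "    captured['input_l2'] = inputs[0].detach().float().norm(p=2).item()\n",
        "\n",
        "handle = layer.register_forward_hook(hook)\n",
        "_ = layer(torch.randn(2, 8))  # the hook fires on every forward pass\n",
        "handle.remove()  # always remove hooks when finished\n",
        "print(captured['input_l2'])\n",
        "```"
      ]
    },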
    {
      "cell_type": "code",
      "source": [
        "# Variable global para acumular normas\n",
        "_accumulated_act_norms = {}\n",
        "\n",
        "def setup_mlp_hooks_for_importance(model, device):\n",
        "    \"\"\"\n",
        "    Registra hooks en la entrada de down_proj (X_d) para calcular\n",
        "    la Norma L2 de cada neurona (||X_d^i||), según CFSP Ecuación 8.\n",
        "\n",
        "    Acumula las normas L2 a través de múltiples batches de calibración.\n",
        "\n",
        "    Returns:\n",
        "        handles: Lista de handles de hooks (para remover después)\n",
        "    \"\"\"\n",
        "    global _accumulated_act_norms\n",
        "    _accumulated_act_norms.clear()\n",
        "\n",
        "    # Liberar memoria antes de empezar\n",
        "    gc.collect()\n",
        "    torch.cuda.empty_cache()\n",
        "\n",
        "    handles = []\n",
        "\n",
        "    # Inicializar almacenamiento en CPU (ahorro de VRAM)\n",
        "    for idx, layer in enumerate(model.model.layers):\n",
        "        intermediate_size = layer.mlp.down_proj.in_features\n",
        "        _accumulated_act_norms[idx] = torch.zeros(\n",
        "            intermediate_size,\n",
        "            dtype=torch.float32,  # Explícito para consistencia\n",
        "            device='cpu'\n",
        "        )\n",
        "\n",
        "    def make_hook(layer_idx):\n",
        "        def hook(module, input, output):\n",
        "            \"\"\"\n",
        "            Captura X_d (entrada a down_proj) y calcula su Norma L2.\n",
        "\n",
        "            X_d shape: [batch_size, seq_len, intermediate_size]\n",
        "            Output: [intermediate_size] con ||X_d^i|| para cada neurona i\n",
        "            \"\"\"\n",
        "            X_d = input[0].detach()  # [B, S, I]\n",
        "\n",
        "            # --- CÁLCULO NORMA L2 (Ecuación 8 del paper) ---\n",
        "            # torch.norm con p=2 y dim=(0,1) hace:\n",
        "            # ||X_d^i|| = sqrt(sum_{b,s} X_d[b,s,i]²)\n",
        "            act_norms_L2 = torch.norm(\n",
        "                X_d.to(torch.float32),  # Asegurar precisión\n",
        "                p=2,\n",
        "                dim=(0, 1)  # Sumar sobre batch y secuencia\n",
        "            )  # Resultado: [intermediate_size]\n",
        "\n",
        "            # Acumular en CPU para ahorrar VRAM\n",
        "            _accumulated_act_norms[layer_idx] += act_norms_L2.cpu()\n",
        "\n",
        "        return hook\n",
        "\n",
        "    # Registrar hooks\n",
        "    for idx, layer in enumerate(model.model.layers):\n",
        "        handle = layer.mlp.down_proj.register_forward_hook(\n",
        "            make_hook(idx)\n",
        "        )\n",
        "        handles.append(handle)\n",
        "\n",
        "    print(f\"✓ Registrados {len(handles)} hooks en down_proj para capturar X_d\")\n",
        "\n",
        "    return handles"
      ],
      "metadata": {
        "id": "8p2pZSxXi6R1"
      },
      "execution_count": 20,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "def get_activation_norms():\n",
        "    \"\"\"\n",
        "    Retorna las normas L2 acumuladas en formato listo para usar.\n",
        "\n",
        "    Returns:\n",
        "        Dict[int, torch.Tensor]: {layer_idx: normas_L2 [intermediate_size]}\n",
        "    \"\"\"\n",
        "    return {\n",
        "        layer_idx: norms.clone()  # Clone para evitar modificaciones\n",
        "        for layer_idx, norms in _accumulated_act_norms.items()\n",
        "    }"
      ],
      "metadata": {
        "id": "-cnY8MOwi2UX"
      },
      "execution_count": 21,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "CK9NwmBWnkSP"
      },
      "source": [
        "#Pruning the Model.\n",
        "##Support pruning functions.\n",
        "###Compute neuron importance functions.\n",
        "\n",
        "\n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 22,
      "metadata": {
        "id": "Seyqaquj7mQA"
      },
      "outputs": [],
      "source": [
        "def compute_neuron_pair_importance(gate_weight, up_weight, down_weight, X_d_norm):\n",
        "    \"\"\"\n",
        "    Computes neuron pair importance scores using CFSP methodology (Equation 8).\n",
        "\n",
        "    Paper: \"CFSP: An Efficient Structured Pruning Framework for LLMs with\n",
        "            Coarse-to-Fine Activation Information\" (arXiv:2409.13199v2)\n",
        "\n",
        "    Equation 8:\n",
        "    F_i^l = Σ_j ( |W_d^ij · ||X_d^i|| / ||W_d^*j · ||X_d^*||| + |W_u^ij| / ||W_u^i*|| + |W_g^ij| / ||W_g^i*|| )\n",
        "\n",
        "    Where:\n",
        "    - W_d: down_proj weights [hidden_size, intermediate_size]\n",
        "    - W_u: up_proj weights [intermediate_size, hidden_size]\n",
        "    - W_g: gate_proj weights [intermediate_size, hidden_size]\n",
        "    - X_d: Activations at down_proj input (SiLU(gate) ⊙ up)\n",
        "    - ||X_d^i||: L2 norm of neuron i across batch and sequence\n",
        "    - W_d^*j: Σ_i |W_d^ij| (sum of weight magnitudes in column j)\n",
        "    - ||X_d^*||: Σ_i ||X_d^i|| (sum of all activation norms)\n",
        "\n",
        "    Args:\n",
        "        gate_weight: Tensor [intermediate_size, hidden_size] from gate_proj.weight\n",
        "        up_weight: Tensor [intermediate_size, hidden_size] from up_proj.weight\n",
        "        down_weight: Tensor [hidden_size, intermediate_size] from down_proj.weight\n",
        "        X_d_norm: Tensor [intermediate_size] with accumulated L2 norms ||X_d^i||\n",
        "\n",
        "    Returns:\n",
        "        importance_scores: Tensor [intermediate_size] with importance score per neuron pair\n",
        "    \"\"\"\n",
        "    device = gate_weight.device\n",
        "    intermediate_size = gate_weight.size(0)\n",
        "\n",
        "    # Move X_d_norm to same device and ensure float32\n",
        "    X_d_norm = X_d_norm.to(device).to(torch.float32)\n",
        "\n",
        "    # Convert weights to float32 for numerical stability\n",
        "    gate_weight = gate_weight.float()\n",
        "    up_weight = up_weight.float()\n",
        "    down_weight = down_weight.float()\n",
        "\n",
        "    # ============================================================================\n",
        "    # COMPONENT 1: down_proj with activations\n",
        "    # Term: |W_d^ij · ||X_d^i|| / (||W_d^*j|| · ||X_d^*||)\n",
        "    # ============================================================================\n",
        "\n",
        "    # Transpose down_weight: [hidden_size, intermediate_size] -> [intermediate_size, hidden_size]\n",
        "    W_d_t = down_weight.t()  # [intermediate_size, hidden_size]\n",
        "    W_d_abs = torch.abs(W_d_t)  # [intermediate_size, hidden_size]\n",
        "\n",
        "    # --- NUMERATOR: |W_d^ij| * ||X_d^i|| ---\n",
        "    # Element-wise product for each (i, j) pair\n",
        "    numerator = W_d_abs * X_d_norm.unsqueeze(1)  # [intermediate_size, hidden_size]\n",
        "\n",
        "    # --- DENOMINATOR: (Σ_i |W_d^ij|) * (Σ_i ||X_d^i||) ---\n",
        "    # Part 1: ||W_d^*j|| = Σ_i |W_d^ij| (sum over rows for each column j)\n",
        "    W_d_column_sums = W_d_abs.sum(dim=0, keepdim=True)  # [1, hidden_size]\n",
        "\n",
        "    # Part 2: ||X_d^*|| = Σ_i ||X_d^i|| (sum of all activation norms)\n",
        "    X_d_total_norm = X_d_norm.sum()  # Scalar\n",
        "\n",
        "    # Denominator: product of the two sums (broadcast to match numerator shape)\n",
        "    denominator = W_d_column_sums * X_d_total_norm  # [1, hidden_size]\n",
        "\n",
        "    # --- NORMALIZED TERM ---\n",
        "    # Divide numerator by denominator and sum over output dimension j\n",
        "    normalized_down = (numerator / (denominator + 1e-8)).sum(dim=1)  # [intermediate_size]\n",
        "\n",
        "    # ============================================================================\n",
        "    # COMPONENT 2: up_proj weights only\n",
        "    # Term: |W_u^ij| / ||W_u^i*||\n",
        "    # ============================================================================\n",
        "\n",
        "    # Take absolute values of up_proj weights\n",
        "    up_abs = torch.abs(up_weight)  # [intermediate_size, hidden_size]\n",
        "\n",
        "    # Sum over input dimension (rows): ||W_u^i*||\n",
        "    row_sums_up = up_abs.sum(dim=1, keepdim=True)  # [intermediate_size, 1]\n",
        "\n",
        "    # Normalize by row sum and sum over output dimension\n",
        "    normalized_up = (up_abs / (row_sums_up + 1e-8)).sum(dim=1)  # [intermediate_size]\n",
        "\n",
        "    # ============================================================================\n",
        "    # COMPONENT 3: gate_proj weights only\n",
        "    # Term: |W_g^ij| / ||W_g^i*||\n",
        "    # ============================================================================\n",
        "\n",
        "    # Take absolute values of gate_proj weights\n",
        "    gate_abs = torch.abs(gate_weight)  # [intermediate_size, hidden_size]\n",
        "\n",
        "    # Sum over input dimension (rows): ||W_g^i*||\n",
        "    row_sums_gate = gate_abs.sum(dim=1, keepdim=True)  # [intermediate_size, 1]\n",
        "\n",
        "    # Normalize by row sum and sum over output dimension\n",
        "    normalized_gate = (gate_abs / (row_sums_gate + 1e-8)).sum(dim=1)  # [intermediate_size]\n",
        "\n",
        "    # ============================================================================\n",
        "    # FINAL IMPORTANCE SCORE (Equation 8)\n",
        "    # F_i^l = sum of all three components\n",
        "    # ============================================================================\n",
        "\n",
        "    importance_scores = normalized_down + normalized_up + normalized_gate\n",
        "\n",
        "    return importance_scores"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 23,
      "metadata": {
        "id": "NX9Boph94RWA"
      },
      "outputs": [],
      "source": [
        "def prune_neuron_pairs(mlp, prune_percent, X_d_norm, layer_idx):\n",
        "    \"\"\"\n",
        "    Prunes neuron pairs from MLP block using CFSP importance scores.\n",
        "\n",
        "    Reduces dimensions of gate_proj, up_proj, and down_proj layers by removing\n",
        "    the least important neuron pairs based on data-driven activation analysis.\n",
        "\n",
        "    Args:\n",
        "        mlp: LlamaMLP module to prune\n",
        "        prune_percent: Fraction of neurons to remove (e.g., 0.2 for 20%)\n",
        "        X_d_norm: Tensor [intermediate_size] with accumulated L2 norms ||X_d^i||\n",
        "                  from calibration forward passes\n",
        "        layer_idx: Layer index (for logging/debugging purposes)\n",
        "\n",
        "    Returns:\n",
        "        new_gate_proj: Pruned gate_proj layer\n",
        "        new_up_proj: Pruned up_proj layer\n",
        "        new_down_proj: Pruned down_proj layer\n",
        "        k: New intermediate size after pruning\n",
        "    \"\"\"\n",
        "\n",
        "    # ============================================================================\n",
        "    # STEP 1: Extract weights from original layers\n",
        "    # ============================================================================\n",
        "\n",
        "    gate_weight = mlp.gate_proj.weight.data  # [intermediate_size, hidden_size]\n",
        "    up_weight = mlp.up_proj.weight.data      # [intermediate_size, hidden_size]\n",
        "    down_weight = mlp.down_proj.weight.data  # [hidden_size, intermediate_size]\n",
        "\n",
        "    original_intermediate_size = gate_weight.size(0)\n",
        "\n",
        "    # ============================================================================\n",
        "    # STEP 2: Compute importance scores using CFSP method\n",
        "    # ============================================================================\n",
        "\n",
        "    importance_scores = compute_neuron_pair_importance(\n",
        "        gate_weight=gate_weight,\n",
        "        up_weight=up_weight,\n",
        "        down_weight=down_weight,\n",
        "        X_d_norm=X_d_norm\n",
        "    )\n",
        "\n",
        "    # ============================================================================\n",
        "    # STEP 3: Determine how many neurons to keep\n",
        "    # ============================================================================\n",
        "\n",
        "    # Calculate number of neurons to prune\n",
        "    num_to_prune = min(\n",
        "        int(prune_percent * original_intermediate_size),\n",
        "        original_intermediate_size - 1  # Must keep at least 1 neuron\n",
        "    )\n",
        "\n",
        "    # Calculate number of neurons to keep\n",
        "    k = original_intermediate_size - num_to_prune\n",
        "\n",
        "    # Safety check\n",
        "    if k <= 0:\n",
        "        raise ValueError(\n",
        "            f\"Layer {layer_idx}: Invalid number of neurons to keep: {k}. \"\n",
        "            f\"Original size: {original_intermediate_size}, prune_percent: {prune_percent}\"\n",
        "        )\n",
        "\n",
        "    # ============================================================================\n",
        "    # STEP 4: Select top-k most important neuron pairs\n",
        "    # ============================================================================\n",
        "\n",
        "    # Get indices of top-k neurons by importance score\n",
        "    _, indices_to_keep = torch.topk(\n",
        "        importance_scores,\n",
        "        k,\n",
        "        largest=True,   # Keep neurons with highest importance\n",
        "        sorted=True     # Sort for reproducibility\n",
        "    )\n",
        "\n",
        "    # Sort indices in ascending order (maintains original ordering)\n",
        "    indices_to_keep = indices_to_keep.sort().values\n",
        "\n",
        "    # ============================================================================\n",
        "    # STEP 5: Create new pruned layers\n",
        "    # ============================================================================\n",
        "\n",
        "    device = gate_weight.device\n",
        "\n",
        "    # Create new layers with reduced intermediate dimension\n",
        "    new_gate_proj = nn.Linear(\n",
        "        mlp.gate_proj.in_features,   # hidden_size (unchanged)\n",
        "        k,                             # New intermediate_size\n",
        "        bias=False\n",
        "    ).to(device)\n",
        "\n",
        "    new_up_proj = nn.Linear(\n",
        "        mlp.up_proj.in_features,     # hidden_size (unchanged)\n",
        "        k,                             # New intermediate_size\n",
        "        bias=False\n",
        "    ).to(device)\n",
        "\n",
        "    new_down_proj = nn.Linear(\n",
        "        k,                             # New intermediate_size\n",
        "        mlp.down_proj.out_features,  # hidden_size (unchanged)\n",
        "        bias=False\n",
        "    ).to(device)\n",
        "\n",
        "    # ============================================================================\n",
        "    # STEP 6: Copy weights for kept neurons\n",
        "    # ============================================================================\n",
        "\n",
        "    # For gate_proj and up_proj: keep rows (output dimension)\n",
        "    new_gate_proj.weight.data = gate_weight[indices_to_keep, :]\n",
        "    new_up_proj.weight.data = up_weight[indices_to_keep, :]\n",
        "\n",
        "    # For down_proj: keep columns (input dimension)\n",
        "    new_down_proj.weight.data = down_weight[:, indices_to_keep]\n",
        "\n",
        "    # ============================================================================\n",
        "    # STEP 7: Return pruned layers and new size\n",
        "    # ============================================================================\n",
        "\n",
        "    return new_gate_proj, new_up_proj, new_down_proj, k\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "QT0v_RpeST87"
      },
      "source": [
        "# Prune Loop\n",
        "The `update_model` function iterates through the blocks of the model's Transformer structure. This structure consists of 16 `LlamaDecoderLayer` blocks, each containing a `LlamaSdpaAttention` module and a `LlamaMLP` module. The latter holds the MLP layers that are the target of the pruning process.\n",
        "```\n",
        "(layers): ModuleList(\n",
        "      (0-15): 16 x LlamaDecoderLayer(\n",
        "        (self_attn): LlamaSdpaAttention(\n",
        "          (q_proj): Linear(in_features=2048, out_features=2048, bias=False)\n",
        "          (k_proj): Linear(in_features=2048, out_features=512, bias=False)\n",
        "          (v_proj): Linear(in_features=2048, out_features=512, bias=False)\n",
        "          (o_proj): Linear(in_features=2048, out_features=2048, bias=False)\n",
        "          (rotary_emb): LlamaRotaryEmbedding()\n",
        "        )\n",
        "        (mlp): LlamaMLP(\n",
        "          (gate_proj): Linear(in_features=2048, out_features=8192, bias=False)\n",
        "          (up_proj): Linear(in_features=2048, out_features=8192, bias=False)\n",
        "          (down_proj): Linear(in_features=8192, out_features=2048, bias=False)\n",
        "          (act_fn): SiLU()\n",
        "        )\n",
        "        (input_layernorm): LlamaRMSNorm((2048,), eps=1e-05)\n",
        "        (post_attention_layernorm): LlamaRMSNorm((2048,), eps=1e-05)\n",
        "      )\n",
        "  )    \n",
        "```\n",
        "The layers that will undergo the removal of neurons identified as less useful are:\n",
        "```\n",
        "(gate_proj): Linear(in_features=2048, out_features=8192, bias=False)\n",
        "(up_proj): Linear(in_features=2048, out_features=8192, bias=False)\n",
        "(down_proj): Linear(in_features=8192, out_features=2048, bias=False)\n",
        "```\n",
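        "\n",
        "Because `down_proj` has no bias, removing row *i* from `gate_proj` and `up_proj` together with column *i* of `down_proj` is equivalent to zeroing neuron *i* in the intermediate activation. A minimal, standalone sketch of this equivalence (toy sizes and random weights, not the notebook's model):\n",
        "\n",
        "```python\n",
        "import torch\n",
        "import torch.nn as nn\n",
        "import torch.nn.functional as F\n",
        "\n",
        "hidden, inter = 4, 6\n",
        "gate = nn.Linear(hidden, inter, bias=False)\n",
        "up = nn.Linear(hidden, inter, bias=False)\n",
        "down = nn.Linear(inter, hidden, bias=False)\n",
        "keep = torch.tensor([0, 2, 5])  # neuron pairs to keep\n",
        "\n",
        "# Row-slice gate/up, column-slice down, as prune_neuron_pairs does\n",
        "g2 = nn.Linear(hidden, len(keep), bias=False)\n",
        "g2.weight.data = gate.weight.data[keep]\n",
        "u2 = nn.Linear(hidden, len(keep), bias=False)\n",
        "u2.weight.data = up.weight.data[keep]\n",
        "d2 = nn.Linear(len(keep), hidden, bias=False)\n",
        "d2.weight.data = down.weight.data[:, keep]\n",
        "\n",
        "x = torch.randn(2, hidden)\n",
        "mask = torch.zeros(inter)\n",
        "mask[keep] = 1\n",
        "full_masked = down((F.silu(gate(x)) * up(x)) * mask)\n",
        "pruned = d2(F.silu(g2(x)) * u2(x))\n",
        "print(torch.allclose(full_masked, pruned, atol=1e-6))  # True\n",
        "```\n",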
        "The neuron pairs are removed in the `prune_neuron_pairs` function, based on the data-driven importance scores computed from the calibration activations."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 24,
      "metadata": {
        "id": "FxJEWg1X3j0m"
      },
      "outputs": [],
      "source": [
        "def update_model(model, prune_percent, activation_norms):\n",
        "    \"\"\"\n",
        "    Applies pruning to all MLP layers in the model using the CFSP method.\n",
        "\n",
        "    Iterates through each transformer layer and prunes its MLP block based on\n",
        "    data-driven importance scores computed from calibration activations.\n",
        "\n",
        "    Args:\n",
        "        model: LlamaForCausalLM model to prune\n",
        "        prune_percent: Fraction of neurons to remove (e.g., 0.2 for 20%)\n",
        "        activation_norms: Dict mapping layer_idx -> X_d_norm tensor\n",
        "                          Example: {0: tensor([...]), 1: tensor([...]), ...}\n",
        "\n",
        "    Returns:\n",
        "        model: Pruned model with updated layers and config\n",
        "    \"\"\"\n",
        "\n",
        "    new_intermediate_size = None\n",
        "    pruning_stats = []\n",
        "\n",
        "    print(f\"\\n{'='*60}\")\n",
        "    print(f\"Starting pruning with {prune_percent*100:.1f}% width pruning\")\n",
        "    print(f\"{'='*60}\\n\")\n",
        "\n",
        "    # ============================================================================\n",
        "    # Prune each MLP layer\n",
        "    # ============================================================================\n",
        "\n",
        "    for idx, layer in enumerate(model.model.layers):\n",
        "        # Get MLP module\n",
        "        mlp = layer.mlp\n",
        "\n",
        "        # Get activation norms for this layer\n",
        "        if idx not in activation_norms:\n",
        "            raise KeyError(\n",
        "                f\"No activation norms found for layer {idx}. \"\n",
        "                f\"Available layers: {list(activation_norms.keys())}\"\n",
        "            )\n",
        "\n",
        "        X_d_norm = activation_norms[idx]\n",
        "\n",
        "        # Store original size\n",
        "        original_size = mlp.gate_proj.out_features\n",
        "\n",
        "        # Prune the neuron pairs\n",
        "        new_gate_proj, new_up_proj, new_down_proj, new_size = prune_neuron_pairs(\n",
        "            mlp=mlp,\n",
        "            prune_percent=prune_percent,\n",
        "            X_d_norm=X_d_norm,\n",
        "            layer_idx=idx\n",
        "        )\n",
        "\n",
        "        # Replace layers in model\n",
        "        mlp.gate_proj = new_gate_proj\n",
        "        mlp.up_proj = new_up_proj\n",
        "        mlp.down_proj = new_down_proj\n",
        "\n",
        "        # Store statistics\n",
        "        pruning_stats.append({\n",
        "            'layer': idx,\n",
        "            'original_size': original_size,\n",
        "            'new_size': new_size,\n",
        "            'pruned': original_size - new_size,\n",
        "            'kept_percent': (new_size / original_size) * 100\n",
        "        })\n",
        "\n",
        "        # Set new_intermediate_size (same for all layers)\n",
        "        if new_intermediate_size is None:\n",
        "            new_intermediate_size = new_size\n",
        "\n",
        "        # Progress indicator\n",
        "        if (idx + 1) % 4 == 0:\n",
        "            print(f\"  Pruned layers {idx-3:2d}-{idx:2d}: \"\n",
        "                  f\"{original_size} → {new_size} neurons \"\n",
        "                  f\"({(new_size/original_size)*100:.1f}% kept)\")\n",
        "\n",
        "    # ============================================================================\n",
        "    # Update model configuration\n",
        "    # ============================================================================\n",
        "\n",
        "    model.config.intermediate_size = new_intermediate_size\n",
        "\n",
        "    # ============================================================================\n",
        "    # Print summary statistics\n",
        "    # ============================================================================\n",
        "\n",
        "    print(f\"\\n{'='*60}\")\n",
        "    print(f\"Pruning completed!\")\n",
        "    print(f\"{'='*60}\")\n",
        "    print(f\"  Layers pruned: {len(pruning_stats)}\")\n",
        "    print(f\"  Original intermediate size: {original_size}\")\n",
        "    print(f\"  New intermediate size: {new_intermediate_size}\")\n",
        "    print(f\"  Neurons pruned per layer: {original_size - new_intermediate_size}\")\n",
        "    print(f\"  Effective width pruning: {((original_size - new_intermediate_size) / original_size) * 100:.2f}%\")\n",
        "    print(f\"{'='*60}\\n\")\n",
        "\n",
        "    return model\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "KtHtSbRmS267"
      },
      "source": [
        "## Obtain & test the pruned model."
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "# Step 1: Setup hooks to capture activations\n",
        "print(\"Setting up activation hooks...\")\n",
        "handles = setup_mlp_hooks_for_importance(model, device)\n",
        "\n",
        "# Step 2: Run calibration forward passes\n",
        "print(\"=\"*60)\n",
        "print(\"RUNNING CALIBRATION FORWARD PASSES\")\n",
        "print(\"=\"*60)\n",
        "\n",
        "model.eval()  # Set to evaluation mode\n",
        "\n",
        "with torch.no_grad():\n",
        "    for batch_idx, batch in enumerate(tqdm(dataloaderwiki, desc=\"Calibration\")):\n",
        "        # Move batch to device\n",
        "        # Your DataLoader already returns 'input_ids' and 'attention_mask'\n",
        "        inputs = {\n",
        "            'input_ids': batch['input_ids'].to(device),\n",
        "            'attention_mask': batch['attention_mask'].to(device)\n",
        "        }\n",
        "\n",
        "        # Forward pass (hooks are triggered automatically)\n",
        "        outputs = model(**inputs)\n",
        "\n",
        "        # Optional: Clear cache periodically to avoid OOM\n",
        "        if (batch_idx + 1) % 10 == 0:\n",
        "            torch.cuda.empty_cache()\n",
        "\n",
        "print(f\"\\n✓ Processed {len(dataloaderwiki)} batches\")\n",
        "print()\n",
        "\n",
        "# Step 3: Clean up hooks\n",
        "print(\"Removing hooks...\")\n",
        "for handle in handles:\n",
        "    handle.remove()\n",
        "\n",
        "# Step 4: Get accumulated activation norms\n",
        "print(\"Extracting activation statistics...\")\n",
        "activation_norms = get_activation_norms()\n",
        "\n",
        "# Verify we have norms for all layers\n",
        "num_layers = len(model.model.layers)\n",
        "assert len(activation_norms) == num_layers, \\\n",
        "    f\"Expected norms for {num_layers} layers, got {len(activation_norms)}\"\n",
        "\n",
        "print(f\"✓ Collected activation norms for {num_layers} layers\")\n"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "NyV0nNxI2WgU",
        "outputId": "017fa020-31a2-499e-8a69-d3854a57198c"
      },
      "execution_count": 25,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Setting up activation hooks...\n",
            "✓ Registered 16 hooks on down_proj to capture X_d\n",
            "============================================================\n",
            "RUNNING CALIBRATION FORWARD PASSES\n",
            "============================================================\n"
          ]
        },
        {
          "output_type": "stream",
          "name": "stderr",
          "text": [
            "Calibration: 100%|██████████| 125/125 [02:29<00:00,  1.19s/it]"
          ]
        },
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "\n",
            "✓ Processed 125 batches\n",
            "\n",
            "Removing hooks...\n",
            "Extracting activation statistics...\n",
            "✓ Collected activation norms for 16 layers\n"
          ]
        },
        {
          "output_type": "stream",
          "name": "stderr",
          "text": [
            "\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 26,
      "metadata": {
        "id": "NIUnFU5R3n42",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "5ed32b7a-d858-4436-ff21-ca5e73a643c2"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "\n",
            "============================================================\n",
            "Starting pruning with 20.0% width pruning\n",
            "============================================================\n",
            "\n",
            "  Pruned layers  0- 3: 8192 → 6554 neurons (80.0% kept)\n",
            "  Pruned layers  4- 7: 8192 → 6554 neurons (80.0% kept)\n",
            "  Pruned layers  8-11: 8192 → 6554 neurons (80.0% kept)\n",
            "  Pruned layers 12-15: 8192 → 6554 neurons (80.0% kept)\n",
            "\n",
            "============================================================\n",
            "Pruning completed!\n",
            "============================================================\n",
            "  Layers pruned: 16\n",
            "  Original intermediate size: 8192\n",
            "  New intermediate size: 6554\n",
            "  Neurons pruned per layer: 1638\n",
            "  Effective width pruning: 20.00%\n",
            "============================================================\n",
            "\n"
          ]
        }
      ],
      "source": [
        "prune_percent = 0.2  # Prune 20% of neurons\n",
        "model = update_model(model, prune_percent, activation_norms)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 27,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "tdJUkfWI3qMM",
        "outputId": "b4147689-fbda-46da-8979-2381ba233a42"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Pruned model parameters: 1074792448\n",
            "Reduction in parameters: 161021952\n",
            "Percentage of weight savings: 13.03%\n"
          ]
        }
      ],
      "source": [
        "# Recalculate the number of parameters\n",
        "pruned_param_count = count_parameters(model)\n",
        "reduction_in_params = original_param_count - pruned_param_count\n",
        "percentage_savings = (reduction_in_params / original_param_count) * 100\n",
        "\n",
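        "# Sanity check: only the MLP blocks were touched, so the reduction should\n",
        "# equal 3 projections x hidden_size x pruned neurons per layer x 16 layers\n",
        "assert reduction_in_params == 3 * 2048 * (8192 - 6554) * 16\n",
        "\n",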
        "print(f\"Pruned model parameters: {pruned_param_count}\")\n",
        "print(f\"Reduction in parameters: {reduction_in_params}\")\n",
        "print(f\"Percentage of weight savings: {percentage_savings:.2f}%\")\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 28,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "wvj-iIsO5M6U",
        "outputId": "3774f6e8-fe6a-40b1-d1d4-a33f9d8b5770"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Generated text after pruning: Paris is the capital of the French region of Île-de-France. It is located on the Seine River in the north-east of Paris. The city has a population of 1.6 million people, making it the largest city in\n"
          ]
        }
      ],
      "source": [
        "# Test the pruned model\n",
        "generated = get_output(prompt, model, tokenizer)\n",
        "print(f\"Generated text after pruning: {generated}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JGzXMQrVTULv"
      },
      "source": [
        "The result is slightly different from what the original model produced, but it’s still a fairly accurate response.\n",
        "\n",
        "In contrast to the model created in the notebook [6_2_pruning_structured_llama3.2-1b_KO.ipynb](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/6-PRUNING/6_2_pruning_structured_llama3.2-1b_KO.ipynb), where the pruned Llama model lost almost all its utility, the model in this notebook retains a good portion of its knowledge."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "dDQrSrf-VCyI"
      },
      "source": [
        "Looking at the model’s new structure, we can see that the `gate_proj` and `up_proj` layers have had their `out_features` reduced from 8192 to 6554. Consequently, the `down_proj` layer has its `in_features` adjusted to match the new size."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 29,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "ATAiqZW30NYN",
        "outputId": "bd091ba9-2eed-4f35-83f4-975e2351aed7"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "LlamaForCausalLM(\n",
            "  (model): LlamaModel(\n",
            "    (embed_tokens): Embedding(128256, 2048)\n",
            "    (layers): ModuleList(\n",
            "      (0-15): 16 x LlamaDecoderLayer(\n",
            "        (self_attn): LlamaAttention(\n",
            "          (q_proj): Linear(in_features=2048, out_features=2048, bias=False)\n",
            "          (k_proj): Linear(in_features=2048, out_features=512, bias=False)\n",
            "          (v_proj): Linear(in_features=2048, out_features=512, bias=False)\n",
            "          (o_proj): Linear(in_features=2048, out_features=2048, bias=False)\n",
            "        )\n",
            "        (mlp): LlamaMLP(\n",
            "          (gate_proj): Linear(in_features=2048, out_features=6554, bias=False)\n",
            "          (up_proj): Linear(in_features=2048, out_features=6554, bias=False)\n",
            "          (down_proj): Linear(in_features=6554, out_features=2048, bias=False)\n",
            "          (act_fn): SiLUActivation()\n",
            "        )\n",
            "        (input_layernorm): LlamaRMSNorm((2048,), eps=1e-05)\n",
            "        (post_attention_layernorm): LlamaRMSNorm((2048,), eps=1e-05)\n",
            "      )\n",
            "    )\n",
            "    (norm): LlamaRMSNorm((2048,), eps=1e-05)\n",
            "    (rotary_emb): LlamaRotaryEmbedding()\n",
            "  )\n",
            "  (lm_head): Linear(in_features=2048, out_features=128256, bias=False)\n",
            ")\n"
          ]
        }
      ],
      "source": [
        "print(model)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "q6qEmvooZycx"
      },
      "source": [
        "# Upload the model to Hugging Face."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 30,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "S2Ll_kqe5QzO",
        "outputId": "fdadff6c-3e85-4026-a7c3-f270eb7d8599"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Pruned model saved to ./pruned20-llama-1b-db\n"
          ]
        }
      ],
      "source": [
        "new_model_name = 'pruned20-llama-1b-db'\n",
        "output_dir = './'+new_model_name\n",
        "if not os.path.exists(output_dir):\n",
        "    os.makedirs(output_dir)\n",
        "\n",
        "model.save_pretrained(output_dir)\n",
        "tokenizer.save_pretrained(output_dir)\n",
        "print(f\"Pruned model saved to {output_dir}\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "3LjjsGZV5ZHJ"
      },
      "outputs": [],
      "source": [
        "# Push the model to your Hugging Face repository\n",
        "\n",
        "model.push_to_hub(new_model_name, private=True)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "6xNN-aYa5h9B"
      },
      "outputs": [],
      "source": [
        "tokenizer.push_to_hub(new_model_name)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "XdKFR5Ju23kI"
      },
      "source": [
        "# Evaluating models"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_UM2pkqFAYEe"
      },
      "source": [
        "In this section, we'll take a look at some standard evaluations in the world of Large Language Models using the lm-evaluation-harness library from EleutherAI.\n",
        "\n",
        "Specifically, we'll use LAMBADA and BoolQ. Since the pruning performed could be considered structural—that is, it affects the model's overall structure without a specific target—I’ve chosen two rather different evaluation tasks.\n",
        "\n",
        "I want to remind you that the goal of this notebook is to demonstrate the pruning process, so I won’t be doing a comprehensive study of how it impacts performance; that will be saved for a future article. Additionally, these models are designed to be fine-tuned before being used.\n",
        "\n",
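        "As a rough yardstick, the relative accuracy drop can be computed directly from the metric dictionaries returned by the evaluations below (values hard-coded here from this run):\n",
        "\n",
        "```python\n",
        "base = {'boolq': 0.6373, 'lambada_openai': 0.6198, 'lambada_standard': 0.5315}\n",
        "pruned = {'boolq': 0.6226, 'lambada_openai': 0.4452, 'lambada_standard': 0.3705}\n",
        "for task in base:\n",
        "    rel_drop = (base[task] - pruned[task]) / base[task] * 100\n",
        "    print(f\"{task}: {rel_drop:.1f}% relative accuracy drop\")\n",
        "```\n",
        "\n",
        "BoolQ barely moves (about 2%), while both LAMBADA variants lose roughly a quarter to a third of their accuracy: width pruning hurts next-word prediction far more than yes/no comprehension.\n",
        "\n",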
        "However, I believe that seeing how pruning impacts model performance can help illustrate the pruning process itself."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 33,
      "metadata": {
        "id": "XPOg8Hoa22xA",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "b7fc27e3-7639-4f1f-c669-35e27ef06ea1"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m53.6/53.6 kB\u001b[0m \u001b[31m2.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m51.8/51.8 kB\u001b[0m \u001b[31m4.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "  Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m7.5/7.5 MB\u001b[0m \u001b[31m21.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m491.5/491.5 kB\u001b[0m \u001b[31m38.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m84.1/84.1 kB\u001b[0m \u001b[31m7.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m293.6/293.6 kB\u001b[0m \u001b[31m27.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m104.1/104.1 kB\u001b[0m \u001b[31m11.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m91.1/91.1 kB\u001b[0m \u001b[31m6.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Building wheel for rouge-score (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "  Building wheel for sqlitedict (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "  Building wheel for word2number (setup.py) ... \u001b[?25l\u001b[?25hdone\n"
          ]
        }
      ],
      "source": [
        "!pip install -q lm-eval\n",
        "from lm_eval import evaluator, tasks, models"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 34,
      "metadata": {
        "id": "N5wp8lM63IGz"
      },
      "outputs": [],
      "source": [
        "def evaluate_hf_model(model_name, tasks=['arc_easy'], num_fewshot=0):\n",
        "    \"\"\"\n",
        "    Calls the lm-eval evaluator on a model hosted on Hugging Face.\n",
        "\n",
        "    Args:\n",
        "    - model_name: The model name on Hugging Face.\n",
        "    - tasks: Tasks to evaluate.\n",
        "    - num_fewshot: Number of few-shot examples.\n",
        "\n",
        "    Returns:\n",
        "    - metrics.\n",
        "    \"\"\"\n",
        "    model_args = f\"pretrained={model_name},device=cuda\"\n",
        "\n",
        "    results = evaluator.simple_evaluate(\n",
        "      model=\"hf\",\n",
        "      model_args=model_args,\n",
        "      tasks=tasks,\n",
        "      num_fewshot=num_fewshot,  # Number of few-shot examples.\n",
        "      limit=None,  # Use all samples in the evaluation dataset.\n",
        "      bootstrap_iters=10\n",
        "    )\n",
        "\n",
        "    metrics = results.get('results', {})\n",
        "    return metrics"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 35,
      "metadata": {
        "id": "yZm8VvA33Nh6"
      },
      "outputs": [],
      "source": [
        "# Select tasks to evaluate.\n",
        "tasks = ['lambada', 'boolq']"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "kTW3E3mp145e"
      },
      "outputs": [],
      "source": [
        "metrics_base = evaluate_hf_model(\"meta-llama/Llama-3.2-1B\", tasks=tasks)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 37,
      "metadata": {
        "id": "w-g3vyPN3VZp",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "4327819a-5edc-404c-9eac-718cefe829cc"
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "{'boolq': {'alias': 'boolq',\n",
              "  'acc,none': 0.637308868501529,\n",
              "  'acc_stderr,none': 0.008408838061823002},\n",
              " 'lambada_openai': {'alias': 'lambada_openai',\n",
              "  'perplexity,none': 5.747471606969041,\n",
              "  'perplexity_stderr,none': 0.19350717486069613,\n",
              "  'acc,none': 0.6198331069280031,\n",
              "  'acc_stderr,none': 0.006762956659647906},\n",
              " 'lambada_standard': {'alias': 'lambada_standard',\n",
              "  'perplexity,none': 8.673077754353926,\n",
              "  'perplexity_stderr,none': 0.3809304515805616,\n",
              "  'acc,none': 0.5315350281389482,\n",
              "  'acc_stderr,none': 0.006952109107344777}}"
            ]
          },
          "metadata": {},
          "execution_count": 37
        }
      ],
      "source": [
        "metrics_base"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 38,
      "metadata": {
        "id": "aN5KeIkQ15gM",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "13446f56-cb09-4d71-87b4-1caa07c6dc95"
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stderr",
          "text": [
            "WARNING:lm_eval.api.task:[Task: boolq] metric acc is defined, but aggregation is not. using default aggregation=mean\n",
            "WARNING:lm_eval.api.task:[Task: boolq] metric acc is defined, but higher_is_better is not. using default higher_is_better=True\n",
            "WARNING:lm_eval.evaluator:Overwriting default num_fewshot of boolq from None to 0\n",
            "WARNING:lm_eval.evaluator:Overwriting default num_fewshot of lambada_standard from None to 0\n",
            "WARNING:lm_eval.evaluator:Overwriting default num_fewshot of lambada_openai from None to 0\n",
            "100%|██████████| 3270/3270 [00:03<00:00, 1031.22it/s]\n",
            "100%|██████████| 5153/5153 [00:20<00:00, 253.65it/s]\n",
            "100%|██████████| 5153/5153 [00:15<00:00, 342.62it/s]\n",
            "Running loglikelihood requests: 100%|██████████| 16846/16846 [07:12<00:00, 38.96it/s]\n"
          ]
        },
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "bootstrapping for stddev: perplexity\n"
          ]
        },
        {
          "output_type": "stream",
          "name": "stderr",
          "text": [
            "100%|██████████| 1/1 [00:00<00:00, 110.20it/s]"
          ]
        },
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "bootstrapping for stddev: perplexity\n"
          ]
        },
        {
          "output_type": "stream",
          "name": "stderr",
          "text": [
            "\n",
            "100%|██████████| 1/1 [00:00<00:00, 94.87it/s]\n"
          ]
        }
      ],
      "source": [
        "metrics_pruned = evaluate_hf_model(\"pruned20-llama-1b-db\", tasks=tasks)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 39,
      "metadata": {
        "id": "bboB7uU39l_Z",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "4bc2849d-1a3c-4570-9cce-a313d4decf72"
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "{'boolq': {'alias': 'boolq',\n",
              "  'acc,none': 0.6226299694189602,\n",
              "  'acc_stderr,none': 0.008477957863310022},\n",
              " 'lambada_openai': {'alias': 'lambada_openai',\n",
              "  'perplexity,none': 15.401305269771163,\n",
              "  'perplexity_stderr,none': 0.6654464812246137,\n",
              "  'acc,none': 0.4451775664661362,\n",
              "  'acc_stderr,none': 0.006923978566470379},\n",
              " 'lambada_standard': {'alias': 'lambada_standard',\n",
              "  'perplexity,none': 28.310130469438295,\n",
              "  'perplexity_stderr,none': 1.6904404808392741,\n",
              "  'acc,none': 0.37046380749078206,\n",
              "  'acc_stderr,none': 0.006728144610304467}}"
            ]
          },
          "metadata": {},
          "execution_count": 39
        }
      ],
      "source": [
        "metrics_pruned"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "import matplotlib.pyplot as plt\n",
        "import numpy as np\n",
        "\n",
        "def plot_model_comparison(metrics_base, metrics_pruned):\n",
        "    \"\"\"Bar chart comparing base vs. pruned accuracy on the evaluated tasks.\"\"\"\n",
        "\n",
        "    tasks_to_plot = ['boolq', 'lambada_openai', 'lambada_standard']\n",
        "    display_labels = ['BoolQ', 'Lambada OpenAI', 'Lambada Standard']\n",
        "\n",
        "    try:\n",
        "        base_scores = [metrics_base[task]['acc,none'] for task in tasks_to_plot]\n",
        "        pruned_scores = [metrics_pruned[task]['acc,none'] for task in tasks_to_plot]\n",
        "    except KeyError as e:\n",
        "        print(f\"Missing metric {e}; cannot plot comparison.\")\n",
        "        return\n",
        "\n",
        "    n_groups = len(tasks_to_plot)\n",
        "    index = np.arange(n_groups)\n",
        "    bar_width = 0.35\n",
        "\n",
        "    fig, ax = plt.subplots(figsize=(10, 6))\n",
        "\n",
        "    color_base = '#3366CC'\n",
        "    color_pruned = '#DC3912'\n",
        "\n",
        "    ax.bar(index - bar_width / 2,\n",
        "           base_scores,\n",
        "           bar_width,\n",
        "           label='Base Model',\n",
        "           color=color_base)\n",
        "\n",
        "    ax.bar(index + bar_width / 2,\n",
        "           pruned_scores,\n",
        "           bar_width,\n",
        "           label='Pruned Model',\n",
        "           color=color_pruned)\n",
        "\n",
        "    ax.set_ylim([0, 0.9])\n",
        "    ax.set_yticks(np.arange(0, 0.9, 0.2))\n",
        "    ax.tick_params(axis='both', which='major', labelsize=12)\n",
        "\n",
        "    ax.set_xticks(index)\n",
        "    ax.set_xticklabels(display_labels)\n",
        "\n",
        "    ax.yaxis.grid(True, linestyle='-', linewidth=0.5, color='lightgray')\n",
        "    ax.set_axisbelow(True)\n",
        "\n",
        "    ax.legend(fontsize=12, loc='upper center', bbox_to_anchor=(0.5, 1.1),\n",
        "              ncol=2, frameon=False)\n",
        "\n",
        "    plt.tight_layout()\n",
        "    plt.show()"
      ],
      "metadata": {
        "id": "6jhx184oHcbO"
      },
      "execution_count": 40,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "plot_model_comparison(metrics_base, metrics_pruned)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 611
        },
        "id": "qZQYe49vHerT",
        "outputId": "664a78e9-5779-4e72-8172-cc6fd42093cf"
      },
      "execution_count": 41,
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "text/plain": [
              "<Figure size 1000x600 with 1 Axes>"
            ],
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAA94AAAJSCAYAAAA1aB08AAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjAsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvlHJYcgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAQzNJREFUeJzt3XeYVdW5P/AXBpwZugooxAERVGyIQZnEqIioKHoRDViCQWxo7jVWoiGJUYyxYTBcjUZuEBBbFI1euyQUQQ1oBEuiUqQYRLFQFQaB/fvD35zrOJQ56HIAP5/nmSc5a6+197vPzFmeL7vVyLIsCwAAACCJmtVdAAAAAGzNBG8AAABISPAGAACAhARvAAAASEjwBgAAgIQEbwAAAEhI8AYAAICEBG8AAABISPAGAACAhARvAAAASEjwBgAAgIQEbwAAAEhI8AYAAICEBG8AAABISPAGAACAhARvAAAASEjwBoCvyfjx46NGjRoxfvz4vMeOGDEiatSoEXPmzPna62LrceWVV0aNGjWqu4z18hkAWDfBG4D1Kv8i/MWfpk2bRufOnePJJ5+s7vLWq2/fvlGjRo1o0KBBrFixotLyGTNm5PbnxhtvrIYKSe3Lf7tFRUWx2267xXnnnRfvv/9+dZeXnM8AwOalVnUXALA12/+cf1R3CfHS7R2+8jquuuqqaNWqVWRZFu+//36MGDEiunXrFo8++mgce+yxX0OVX79atWrFp59+Go8++miceOKJFZbdfffdUVRUFCtXrqym6rYMsw7cvrpLiNbPf/SVxpf/7a5cuTImTZoUt912WzzxxBPx+uuvR506db6mKjdPPgMAmw9HvAHYqKOPPjpOPfXU+PGPfxz9+/ePiRMnRu3atePee++t7tLWq7CwMLp06bLOGu+555445phjqqEqvmnlf7tnnXVWjBgxIi688MKYPXt2PPLII+sd88knn3yDFabjMwCw+RC8Achbo0aNori4OGrVqnji1I033hgHHnhgbL/99lFcXBwdOnSI0aNHVxo/ZsyYOOigg6JRo0ZRr1692H333eMXv/hFhT5lZWVxxRVXRJs2baKwsDBKSkri0ksvjbKysirX+aMf/SiefPLJWLx4ca7txRdfjBkzZsSPfvSjdY55++23o1evXrHddttFnTp14nvf+148/vjjlfr9+9//jh49ekTdunWjadOmcdFFF623tsmTJ8dRRx0VDRs2jDp16kSnTp3iueeeq/J+8PU57LDDIiJi9uzZEfH5Kdn16tWLWbNmRbdu3aJ+/frRu3fviIjYeeedo2/fvpXWceihh8ahhx6ae11+XfP9998fv/3tb2OnnXaKoqKi6NKlS8ycObPS+Kr+PUyaNCkOOOCAKCoqitatW8ftt9+e9/76DABsHpxqDsBGLVmyJD788MPIsiwWLlwYN998cyxfvjxOPfXUCv2GDBkS3bt3j969e8eqVavivvvui169esVjjz2WO7r2z3/+M4499tho165dXHXVVVFYWBgzZ86s8CV87dq10b1795g0aVL069cv9thjj3jttdfipptuiunTp8fDDz9cpbpPOOGEOPfcc+Ohhx6KM844IyI+P9LXtm3b+O53v1up//vvvx8HHnhgfPrpp3H++efH9ttvHyNHjozu3bvH6NGj4/jjj4+IiBUrVkSXLl1i3rx5cf7550fz5s1j1KhRMXbs2ErrHDt2bBx99NHRoUOHuOKKK6JmzZoxfPjwOOyww2LixInRsWPHKu0LX49Zs2ZFRMT22//fafSrV6+Orl27xkEHHRQ33njjJp+Cft1110XNmjWjf//+sWTJkrjhhhuid+/eMXny5Fyfqv49vPbaa3HkkUdGkyZN4sorr4zVq1fHFVdcETvssENeNfkMAGweBG8ANurwww+v8LqwsDDuuOOOOOKIIyq0T58+PYqLi3OvzzvvvPjud78bgwcPzgXvMWPGxKpVq+LJJ5
+Mxo0br3N799xzT/z1r3+NCRMmxEEHHZRr33vvvePcc8+N559/Pg488MCN1l2/fv049thj45577okzzjgj1q5dG/fdd1/85Cc/WWf/6667Lt5///2YOHFibrtnn312tGvXLi6++OI47rjjombNmjF06NCYPn163H///dGrV69cv3333bfC+rIsi3PPPTd3M7ryu1Gfc845sddee8WvfvWreOaZZza6H2y68n80WrlyZTz33HNx1VVXRXFxcYV7E5SVlUWvXr3i2muv/UrbWrlyZUybNi222WabiIjYdttt44ILLojXX3899t5777z+Hn79619HlmUxceLEaNGiRURE/PCHP4x99tknr5p8BgA2D041B2Cj/vCHP8SYMWNizJgxcdddd0Xnzp3jrLPOioceeqhCvy+G7kWLFsWSJUvi4IMPjpdffjnX3qhRo4iIeOSRR2Lt2rXr3N4DDzwQe+yxR7Rt2zY+/PDD3E/5acLjxo2rcu0/+tGPYvz48fHee+/F2LFj47333lvvKbZPPPFEdOzYsULYr1evXvTr1y/mzJkT//rXv3L9mjVrFj179sz1q1OnTvTr16/C+qZNm5Y7pfejjz7K7ccnn3wSXbp0iWeffXa97wFfj8MPPzyaNGkSJSUlcfLJJ0e9evXiL3/5S3znO9+p0G99QTQfp59+ei50R0QcfPDBEfH5qdsRVf97WLNmTTz99NPRo0ePXOiOiNhjjz2ia9euedflMwBQ/RzxBmCjOnbsGPvvv3/u9SmnnBL77bdfnHfeeXHsscfmwsZjjz0WV199dUybNq3CtZ5ffO7wSSedFH/605/irLPOip///OfRpUuXOOGEE6Jnz55Rs+bn/x48Y8aMeOONN6JJkybrrGfhwoVVrr38ut0///nPMW3atDjggAOiTZs263xW8Ny5c6O0tLRS+x577JFbvvfee8fcuXOjTZs2lZ6nvPvuu1d4PWPGjIiIOO2009Zb35IlS2Lbbbet8v6Qnz/84Q+x2267Ra1atWKHHXaI3XffPfd3Vq5WrVqx0047feVtfTEkR0Tu97po0aKIqPrfQ1lZWaxYsSJ23XXXSst33333eOKJJ/Kqy2cAoPoJ3gDkrWbNmtG5c+cYMmRIzJgxI/baa6+YOHFidO/ePQ455JC49dZbo1mzZlG7du0YPnx43HPPPbmxxcXF8eyzz8a4cePi8ccfj6eeeir+/Oc/x2GHHRbPPPNMFBQUxNq1a2OfffaJwYMHr3P7JSUlVa61sLAwTjjhhBg5cmS8/fbbceWVV37V3a+y8iN5gwYNivbt26+zT7169b6xer6NvvyPRutSWFhYKYxHRKVQWW7NmjVRUFBQqX1dbRGfn24dUfW/h3xuIFgVPgMA1U/wBmCTrF69OiIili9fHhERDz74YBQVFcXTTz8dhYWFuX7Dhw+vNLZmzZrRpUuX6NKlSwwePDiuueaa+OUvfxnjxo2Lww8/PFq3bh2vvPJKdOnSZb3hJx8/+tGP4o477oiaNWvGySefvN5+LVu2jLfeeqtS+5tvvplbXv6/r7/+emRZVqG+L49t3bp1REQ0aNCg0nXybP623XbbCncDLzd37tzYZZdd8l5fVf8emjRpEsXFxbmjxV+0rr/PqvAZAKhervEGIG+fffZZPPPMM7HNNtvkTkEtKCiIGjVqxJo1a3L95syZU+kO5B9//HGl9ZUfCSs/0nfiiSfG/Pnz43/+538q9V2xYkXez1nu3Llz/OY3v4lbbrkldtxxx/X269atW0yZMiVeeOGFXNsnn3wSQ4cOjZ133jn23HPPXL933323wqPSPv300xg6dGiF9XXo0CFat24dN954Y+4fKL7ogw8+yGs/+Ga1bt06/v73v8eqVatybY899li88847m7S+qv49FBQURNeuXePhhx+OefPm5Za/8cYb8fTTT2/Stn0GAKqXI94AbNSTTz6ZO+K1cOHCuOeee2LGjBnx85//PBo0aBAREcccc0wMHj
w4jjrqqPjRj34UCxcujD/84Q/Rpk2bePXVV3Pruuqqq+LZZ5+NY445Jlq2bBkLFy6MW2+9NXbaaafcDZ1+/OMfx/333x/nnntujBs3Ln7wgx/EmjVr4s0334z7778/nn766Y2ePvxFNWvWjF/96lcb7ffzn/887r333jj66KPj/PPPj+222y5GjhwZs2fPjgcffDB3OvLZZ58dt9xyS/Tp0yf+8Y9/RLNmzWLUqFGVHkNVs2bN+NOf/hRHH3107LXXXnH66afHd77znZg/f36MGzcuGjRoEI8++miV94Nv1llnnRWjR4+Oo446Kk488cSYNWtW3HXXXbmjuPnK5+9h4MCB8dRTT8XBBx8c//mf/xmrV6+Om2++Ofbaa68Kn6d8tu0zAFCNMgBYj+HDh2cRUeGnqKgoa9++fXbbbbdla9eurdB/2LBh2a677poVFhZmbdu2zYYPH55dccUV2Rf/c/O3v/0tO+6447LmzZtn22yzTda8efPslFNOyaZPn15hXatWrcquv/76bK+99soKCwuzbbfdNuvQoUM2cODAbMmSJRus+7TTTsvq1q27wT6zZ8/OIiIbNGhQhfZZs2ZlPXv2zBo1apQVFRVlHTt2zB577LFK4+fOnZt17949q1OnTta4cePsggsuyJ566qksIrJx48ZV6Dt16tTshBNOyLbffvussLAwa9myZXbiiSdmf/vb3yq917Nnz95g3VRN+fv54osvbrDfxv5Wfve732Xf+c53ssLCwuwHP/hB9tJLL2WdOnXKOnXqlOszbty4LCKyBx54oMLY8r+x4cOHV2ivyt9DlmXZhAkTsg4dOmTbbLNNtssuu2R//OMfK32eNnW/vlifzwBAejWy7P/f8QMAAAD42rnGGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABKqVd0FbIq1a9fGu+++G/Xr148aNWpUdzkAAAB8y2RZFsuWLYvmzZtHzZobPqa9RQbvd999N0pKSqq7DAAAAL7l3nnnndhpp5022GeLDN7169ePiM93sEGDBtVcDQAAAN82S5cujZKSklw+3ZAtMniXn17eoEEDwRsAAIBqU5XLn91cDQAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEso7eJeVlcVll10WzZs3j+Li4igtLY0xY8ZUaexf//rX6Ny5czRu3DgaNWoUHTt2jFGjRuVdNAAAAGwp8g7effv2jcGDB0fv3r1jyJAhUVBQEN26dYtJkyZtcNz//u//xpFHHhmrVq2KK6+8Mn77299GcXFx9OnTJ2666aZN3gEAAADYnNXIsiyraucpU6ZEaWlpDBo0KPr37x8REStXroy99947mjZtGs8///x6xx555JHxz3/+M95+++0oLCyMiIjVq1dH27Zto27duvHKK69UueilS5dGw4YNY8mSJdGgQYMqjwMAAICvQz65NK8j3qNHj46CgoLo169frq2oqCjOPPPMeOGFF+Kdd97ZYFHbbrttLnRHRN
SqVSsaN24cxcXF+ZQBAAAAW4y8gvfUqVNjt912q5TmO3bsGBER06ZNW+/YQw89NP75z3/G5ZdfHjNnzoxZs2bFb37zm3jppZfi0ksvzb9yAAAA2ALUyqfzggULolmzZpXay9vefffd9Y69/PLLY/bs2fHb3/42rr766oiIqFOnTjz44INx3HHHbXC7ZWVlUVZWlnu9dOnSfMoGAACAapNX8F6xYkWFU8XLFRUV5ZavT2FhYey2227Rs2fPOOGEE2LNmjUxdOjQOPXUU2PMmDHxve99b71jr7322hg4cGCl9nnz5kX9+vXz2QUAAAD4ypYtW1blvnkF7+Li4gpHnsutXLkyt3x9zjvvvPj73/8eL7/8ctSs+fkZ7ieeeGLstddeccEFF8TkyZPXO3bAgAFx8cUX514vXbo0SkpKokWLFm6uBgAAwDcunzOx87rGu1mzZrFgwYJK7eVtzZs3X+e4VatWxbBhw+KYY47Jhe6IiNq1a8fRRx8dL730UqxatWq92y0sLIwGDRpU+AEAAIAtQV7Bu3379jF9+vRKyb78aHX79u3XOe6jjz6K1atXx5o1ayot++yzz2Lt2rXrXAYAAABburyCd8+ePXPXZpcrKyuL4cOHR2lpaZSUlETE59dev/nmm7k+TZs2jUaNGsVf/vKXCke2ly9fHo8++mi0bdvWI8UAAADYKuV1jXdpaWn06tUrBgwYEAsXLow2bdrEyJEjY86cOTFs2LBcvz59+sSECRMiy7KIiCgoKIj+/fvHr371q/je974Xffr0iTVr1sSwYcPi3//+d9x1111f714BAADAZiKv4B0Rceedd8bll18eo0aNikWLFkW7du3isccei0MOOWSD4375y19Gq1atYsiQITFw4MAoKyuLdu3axejRo+OHP/zhJu8AAAAAbM5qZOWHpbcgS5cujYYNG8aSJUvcaA0AAIBvXD65NK9rvAEAAID8CN4AAACQkOANAAAACQneAAAAkJDgDQAAAAkJ3gAAAJCQ4A0AAAAJCd4AAACQkOANAAAACQneAAAAkJDgDQAAAAkJ3gAAAJCQ4A0AAAAJCd4AAACQkOANAAAACQneAAAAkJDgDQAAAAkJ3gAAAJCQ4A0AAAAJCd4AAACQkOANAAAACQneAAAAkJDgDQAAAAkJ3gAAAJCQ4A0AAAAJCd4AAACQkOANAAAACQneAAAAkJDgDQAAAAkJ3gAAAJCQ4A0AAAAJCd4AAACQkOANAAAACQneAAAAkJDgDQAAAAkJ3gAAAJCQ4A0AAAAJCd4AAACQkOANAAAACQneAAAAkJDgDQAAAAkJ3gAAAJCQ4A0AAAAJCd4AAACQkOANAAAACQneAAAAkJDgDQAAAAkJ3gAAAJCQ4A0AAAAJCd4AAACQkOANAAAACQneAAAAkJDgDQAAAAkJ3gAAAJCQ4A0AAAAJCd4AAACQkOANAAAACQneAAAAkJDgDQAAAAkJ3gAAAJCQ4A0AAAAJCd4AAACQkOANAAAACQneAAAAkJDgDQAAAAkJ3gAAAJCQ4A0AAAAJCd4AAACQkOANAAAACQneAAAAkJDgDQAAAAkJ3gAAAJCQ4A0AAAAJCd4AAACQkOANAAAACQneAAAAkJDgDQAAAAkJ3gAAAJCQ4A0AAAAJCd4AAACQkOANAAAACQneAAAAkJDgDQAAAAnVqu4Cvg32P+cf1V0Cm+Cl2ztUdwkAAMBWwBFvAAAASEjwBgAAgIQEbwAAAEhI8AYAAICEBG8AAABISPAGAACAhARvAAAASMhzvGE9Zh24fXWXwCZo/fxH1V0CAABU4Ig3AAAAJOSINwBf2f7n/KO6S2ATvHR7h+ouAQC+FRzxBgAAgIQEbwAAAEhI8AYAAICEBG8AAABIKO/gXVZWFpdddlk0b948iouLo7S0NMaMGVPl8X/+85/j+9//ftStWzcaNWoUBx54YIwdOzbfMgAAAGCLkHfw7tu3bwwePDh69+4dQ4YMiYKCgujWrVtMmj
Rpo2OvvPLKOOWUU6KkpCQGDx4cV199dbRr1y7mz5+/ScUDAADA5i6vx4lNmTIl7rvvvhg0aFD0798/IiL69OkTe++9d1x66aXx/PPPr3fs3//+97jqqqvid7/7XVx00UVfrWoAAADYQuR1xHv06NFRUFAQ/fr1y7UVFRXFmWeeGS+88EK888476x37+9//Pnbccce44IILIsuyWL58+aZXDQAAAFuIvIL31KlTY7fddosGDRpUaO/YsWNEREybNm29Y//2t7/FAQccEP/93/8dTZo0ifr160ezZs3illtuyb9qAAAA2ELkdar5ggULolmzZpXay9vefffddY5btGhRfPjhh/Hcc8/F2LFj44orrogWLVrE8OHD46c//WnUrl07zjnnnPVut6ysLMrKynKvly5dmk/ZAAAAUG3yCt4rVqyIwsLCSu1FRUW55etSflr5Rx99FPfdd1+cdNJJERHRs2fP2GeffeLqq6/eYPC+9tprY+DAgZXa582bF/Xr189nF4Ct3Ny5c6u7BNhi+LwAwKZbtmxZlfvmFbyLi4srHHkut3Llytzy9Y2LiKhdu3b07Nkz116zZs046aST4oorroh58+ZFixYt1jl+wIABcfHFF+deL126NEpKSqJFixaVTnvfPH1Y3QXAt0bLli2ru4RvKfPclsjnBQA2XT5nYucVvJs1a7bOR38tWLAgIiKaN2++znHbbbddFBUVRaNGjaKgoKDCsqZNm0bE56ejry94FxYWrvNIOwAAAGzu8rq5Wvv27WP69OmVkv3kyZNzy9e5kZo1o3379vHBBx/EqlWrKiwrvy68SZMm+ZQCAAAAW4S8gnfPnj1jzZo1MXTo0FxbWVlZDB8+PEpLS6OkpCQiPr/2+s0336ww9qSTToo1a9bEyJEjc20rV66Mu+++O/bcc8/1Hi0HAACALVlep5qXlpZGr169YsCAAbFw4cJo06ZNjBw5MubMmRPDhg3L9evTp09MmDAhsizLtZ1zzjnxpz/9Kf7rv/4rpk+fHi1atIhRo0bF3Llz49FHH/369ggAAAA2I3kF74iIO++8My6//PIYNWpULFq0KNq1axePPfZYHHLIIRscV1xcHGPHjo1LL7007rjjjvjkk0+iffv28fjjj0fXrl03eQcAAABgc5Z38C4qKopBgwbFoEGD1ttn/Pjx62xv2rRpjBgxIt9NAgAAwBYrr2u8AQAAgPwI3gAAAJCQ4A0AAAAJCd4AAACQkOANAAAACQneAAAAkJDgDQAAAAkJ3gAAAJCQ4A0AAAAJCd4AAACQkOANAAAACQneAAAAkJDgDQAAAAkJ3gAAAJCQ4A0AAAAJCd4AAACQkOANAAAACQneAAAAkJDgDQAAAAkJ3gAAAJCQ4A0AAAAJCd4AAACQkOANAAAACQneAAAAkJDgDQAAAAkJ3gAAAJCQ4A0AAAAJCd4AAACQkOANAAAACQneAAAAkFCt6i4AAAA2d/uf84/qLoFN8NLtHaq7BIgIR7wBAAAgKcEbAAAAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASqlXdBQAA1WPWgdtXdwlsgtbPf1TdJQCQJ0e8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAA
AAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEso7eJeVlcVll10WzZs3j+Li4igtLY0xY8bkveEjjjgiatSoEeedd17eYwEAAGBLkXfw7tu3bwwePDh69+4dQ4YMiYKCgujWrVtMmjSpyut46KGH4oUXXsh30wAAALDFySt4T5kyJe6777649tprY9CgQdGvX78YO3ZstGzZMi699NIqrWPlypVxySWXxGWXXbZJBQMAAMCWJK/gPXr06CgoKIh+/frl2oqKiuLMM8+MF154Id55552NruOGG26ItWvXRv/+/fOvFgAAALYweQXvqVOnxm677RYNGjSo0N6xY8eIiJg2bdoGx8+bNy+uu+66uP7666O4uDi/SgEAAGALVCufzgsWLIhmzZpVai9ve/fddzc4/pJLLon99tsvTj755Hw2G2VlZVFWVpZ7vXTp0rzGAwAAQHXJK3ivWLEiCgsLK7UXFRXllq/PuHHj4sEHH4zJkyfnWWLEtddeGwMHDqzUPm/evKhfv37e6wO2XnPnzq3uEgCSMs9B1fm8kNKyZcuq3Dev4F1cXFzhyHO5lStX5pavy+rVq+P888+PH//4x3HAAQfks8mIiBgwYEBcfPHFuddLly6NkpKSaNGiRaXT3jdPH1Z3AfCt0bJly+ou4VvKPAffFPNcdTHPbYl8XkgpnzOx8wrezZo1i/nz51dqX7BgQURENG/efJ3j7rzzznjrrbfi9ttvjzlz5lRYtmzZspgzZ040bdo06tSps87xhYWF6zzSDgAAAJu7vG6u1r59+5g+fXqlZF9++nj79u3XOW7evHnx2WefxQ9+8INo1apV7ifi81DeqlWreOaZZzahfAAAANi85XXEu2fPnnHjjTfG0KFDc48DKysri+HDh0dpaWmUlJRExOdB+9NPP422bdtGRMTJJ5+8zlB+/PHHR7du3eLss8+O0tLSr7grAAAAsPnJK3iXlpZGr169YsCAAbFw4cJo06ZNjBw5MubMmRPDhg3L9evTp09MmDAhsiyLiIi2bdvmQviXtWrVKnr06LHpewAAAACbsbyCd8Tnp4ZffvnlMWrUqFi0aFG0a9cuHnvssTjkkENS1AcAAABbtLyDd1FRUQwaNCgGDRq03j7jx4+v0rrKj4gDAADA1iqvm6sBAAAA+RG8AQAAICHBGwAAABISvAEAACAhwRsAAAASErwBAAAgIcEbAAAAEhK8AQAAICHBGwAAABKqVd0FAAAApDDrwO2ruwQ2UevnP6ruEr5WjngDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeA
MAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACSUd/AuKyuLyy67LJo3bx7FxcVRWloaY8aM2ei4hx56KE466aTYZZddok6dOrH77rvHJZdcEosXL96UugEAAGCLkHfw7tu3bwwePDh69+4dQ4YMiYKCgujWrVtMmjRpg+P69esXb7zxRpx66qnx3//933HUUUfFLbfcEt///vdjxYoVm7wDAAAAsDmrlU/nKVOmxH333ReDBg2K/v37R0REnz59Yu+9945LL700nn/++fWOHT16dBx66KEV2jp06BCnnXZa3H333XHWWWflXz0AAABs5vI64j169OgoKCiIfv365dqKiorizDPPjBdeeCHeeeed9Y79cuiOiDj++OMjIuKNN97IpwwAAADYYuQVvKdOnRq77bZbNGjQoEJ7x44dIyJi2rRpeW38vffei4iIxo0b5zUOAAAAthR5nWq+YMGCaNasWaX28rZ33303r41ff/31UVBQED179txgv7KysigrK8u9Xrp0aV7bAQAAgOqSV/BesWJFFBYWVmovKirKLa+qe+65J4YNGxaXXnpp7Lrrrhvse+2118bAgQMrtc+bNy/q169f5W0CW7+5c+dWdwkASZnngG+DLWGuW7ZsWZX75hW8i4uLKxx5Lrdy5crc8qqYOHFinHnmmdG1a9f47W9/u9H+AwYMiIsvvjj3eunSpVFSUhItWrSodNr75unD6i4AvjVatmxZ3SV8S5nn4Jtinqsu5jn4Jm0Jc10+Z2LnFbybNWsW8+fPr9S+YMGCiIho3rz5RtfxyiuvRPfu3WPvvfeO0aNHR61aGy+hsLBwnUfaAQAAYHOX183V2rdvH9OnT6+U7CdPnpxbviGzZs2Ko446Kpo2bRpPPPFE1KtXL79qAQAAYAuTV/Du2bNnrFmzJoYOHZprKysri+HDh0dpaWmUlJRExOfXXr/55psVxr733ntx5JFHRs2aNePpp5+OJk2afA3lAwAAwOYtr1PNS0tLo1evXjFgwIBYuHBhtGnTJkaOHBlz5syJYcOG5fr16dMnJkyYEFmW5dqOOuqoePvtt+PSSy+NSZMmxaRJk3LLdthhhzjiiCO+ht0BAACAzUtewTsi4s4774zLL788Ro0aFYsWLYp27drFY489FocccsgGx73yyisREXHDDTdUWtapUyfBGwAAgK1S3sG7qKgoBg0aFIMGDVpvn/Hjx1dq++LRbwAAAPi2yOsabwAAACA/gjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQo
I3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQoI3AAAAJCR4AwAAQEKCNwAAACQkeAMAAEBCgjcAAAAkJHgDAABAQnkH77KysrjsssuiefPmUVxcHKWlpTFmzJgqjZ0/f36ceOKJ0ahRo2jQoEEcd9xx8fbbb+ddNAAAAGwp8g7effv2jcGDB0fv3r1jyJAhUVBQEN26dYtJkyZtcNzy5cujc+fOMWHChPjFL34RAwcOjKlTp0anTp3io48+2uQdAAAAgM1ZrXw6T5kyJe67774YNGhQ9O/fPyIi+vTpE3vvvXdceuml8fzzz6937K233hozZsyIKVOmxAEHHBAREUcffXTsvffe8bvf/S6uueaar7AbAAAAsHnK64j36NGjo6CgIPr165drKyoqijPPPDNeeOGFeOeddzY49oADDsiF7oiItm3bRpcuXeL+++/fhNIBAABg85fXEe+pU6fGbrvtFg0aNKjQ3rFjx4iImDZtWpSUlFQat3bt2nj11VfjjDPOqLSsY8eO8cwzz8SyZcuifv3669xuWVlZlJWV5V4vWbIkIiKWLl2aT/nVZs2q5dVdAptg2eqsuktgE2wp88LWxjy3ZTLPbZnMc9XDPLdlMs9tubaEua68xizb+N9ZXsF7wYIF0axZs0rt5W3vvvvuOsd9/PHHUVZWttGxu++++zrHX3vttTFw4MBK7esK+fB12a+6C2DTNGxY3RXAFsM8t4Uyz0GVmee2YFvQXLds2bJouJF68wreK1asiMLCwkrtRUVFueXrGxcRmzQ2ImLAgAFx8cUX516vXbs2Pv7449h+++2jRo0aVd8BqKKlS5dGSUlJvPPOO5XO8ADYGpjngK2deY7UsiyLZcuWRfPmzTfaN6/gXVxcXOGU73IrV67MLV/fuIjYpLERnwf2L4f2Ro0aValm+CoaNGhgoga2auY5YGtnniOljR3pLpfXzdWaNWsWCxYsqNRe3ra+pL/ddttFYWHhJo0FAACALVlewbt9+/Yxffr0She6T548Obd8nRupWTP22WefeOmllyotmzx5cuyyyy7rvbEaAAAAbMnyCt49e/aMNWvWxNChQ3NtZWVlMXz48CgtLc3d7GzevHnx5ptvVhr74osvVgjfb731VowdOzZ69er1VfYBvnaFhYVxxRVXrPO+BABbA/McsLUzz7E5qZFV5d7nX3DiiSfGX/7yl7jooouiTZs2MXLkyJgyZUr87W9/i0MOOSQiIg499NCYMGFChduqL1u2LPbbb79YtmxZ9O/fP2rXrh2DBw+ONWvWxLRp06JJkyZf754BAADAZiCvm6tFRNx5551x+eWXx6hRo2LRokXRrl27eOyxx3Khe33q168f48ePj4suuiiuvvrqWLt2bRx66KFx0003Cd0AAABstfI+4g0AAABUXV7XeAMAAAD5EbxhEx166KFx6KGHVncZwBZo5513jmOPPTb5dsaPHx81atSI8ePHJ98W8O1jLvtmfJPfOa+88sqoUaPGN7KtbxvBmy3aiBEjokaNGhV+mjZtGp07d44nn3yyWmr66KOP4mc/+1nsvvvuUVRUFNttt1107do1Hn/88WqpB/i/uWJdj7Ukf/PmzYtzzz03dt555ygsLIymTZtGjx494rnnnqvu0tZr8eLFUVRUFDVq1Ig33nhjnX369u0b9erV+4Yrg6ozl3195syZE6effnq0bt06ioqKYscdd4xDDjkkrrjiigr9br311hgxYkT1FMlWJe+bq8Hm6KqrropWrVpFlmXx/vvvx4gRI6Jbt27x6KOPfiP/Elvurbfeii5dusQHH3wQp59+euy///6xePHiuPvuu+PYY4+Nyy67LK677rpvrB6Ar9tzzz0X3bp1i4iIs846K/bcc8947733YsSIEXHwwQfHkCFD4qc//Wk1V1nZAw88EDVq1Igdd9
wx7r777rj66quruySgmsycOTMOOOCAKC4ujjPOOCN23nnnWLBgQbz88stx/fXXx8CBA3N9b7311mjcuHH07du3+gpmqyB4s1U4+uijY//998+9PvPMM2OHHXaIe++99xsL3p999ln07NkzFi1aFM8++2yUlpbmll100UXRu3fvuP7666NDhw6eXQ9skRYtWhQ9e/aM4uLieO6556J169a5ZRdffHF07do1LrzwwujQoUMceOCB1VhpZXfddVd069YtWrZsGffcc4/gDd9iN910UyxfvjymTZsWLVu2rLBs4cKF1VTVN2P16tWxdu3a2Gabbaq7lG8dp5qzVWrUqFEUFxdHrVr/929Ln3zySVxyySVRUlIShYWFsfvuu8eNN94YX76x/+rVq+M3v/lNtG7dOgoLC2PnnXeOX/ziF1FWVrbBbT744IPx+uuvx89//vMKoTsioqCgIG6//fZo1KhRpVOYgM3DqlWr4te//nV06NAhGjZsGHXr1o2DDz44xo0bV6HfnDlzokaNGnHjjTfGH/7wh9hll12iTp06ceSRR8Y777wTWZbFb37zm9hpp52iuLg4jjvuuPj444/Xuc1nnnkm2rdvH0VFRbHnnnvGQw89VGH5xx9/HP3794999tkn6tWrFw0aNIijjz46XnnllUrr+ve//x09evSIunXrRtOmTeOiiy5a57w1ceLE6NWrV7Ro0SIKCwujpKQkLrroolixYsVG36Pbb7893nvvvRg0aFCF0B0RUVxcHCNHjowaNWrEVVddlWsvPzX22WefjXPOOSe23377aNCgQfTp0ycWLVpUaRtPPvlkHHzwwVG3bt2oX79+HHPMMfHPf/6zQp/yU8Lnz58fPXr0iHr16kWTJk2if//+sWbNmkrrnDdvXkycODFOPvnkOPnkk2P27Nnx/PPPb3R/YUtkLtv4XDZr1qzYaaedKoXuiIimTZvm/v/OO+8c//znP2PChAm5SxrLr7Wu6j6VX59+//33x29/+9vYaaedoqioKLp06RIzZ86stP2hQ4dG69ato7i4ODp27BgTJ06s1GdTfse///3vc99t//Wvf0VExKRJk+KAAw6IoqKiaN26ddx+++0bfe/YdI54s1VYsmRJfPjhh5FlWSxcuDBuvvnmWL58eZx66qkREZFlWXTv3j3GjRsXZ555ZrRv3z6efvrp+NnPfhbz58+Pm266Kbeus846K0aOHBk9e/aMSy65JCZPnhzXXnttvPHGG/GXv/xlvTU8+uijERHRp0+fdS5v2LBhHHfccTFy5MiYNWtWpS+tQPVaunRp/OlPf4pTTjklzj777Fi2bFkMGzYsunbtGlOmTIn27dtX6H/33XfHqlWr4qc//Wl8/PHHccMNN8SJJ54Yhx12WIwfPz4uu+yymDlzZtx8883Rv3//uOOOOyqMnzFjRpx00klx7rnnxmmnnRbDhw+PXr16xVNPPRVHHHFERES8/fbb8fDDD0evXr2iVatW8f7778ftt98enTp1in/961/RvHnziIhYsWJFdOnSJebNmxfnn39+NG/ePEaNGhVjx46ttJ8PPPBAfPrpp/GTn/wktt9++5gyZUrcfPPN8e9//zseeOCBDb5Hjz76aBQVFcWJJ564zuWtWrWKgw46KMaOHRsrVqyI4uLi3LLzzjsvGjVqFFdeeWW89dZbcdttt8XcuXNzX0ojIkaNGhWnnXZadO3aNa6//vr49NNP47bbbouDDjoopk6dGjvvvHNufWvWrImuXbtGaWlp3HjjjfHXv/41fve730Xr1q3jJz/5SYW67r333qhbt24ce+yxUVxcHK1bt4677757szsqD18Hc9nG57KWLVvGX//61xg7dmwcdthh6+33+9//Pn76059GvXr14pe//GVEROywww557VO56667LmrWrBn9+/ePJUuWxA033BC9e/eOyZMn5/oMGzYszjnnnDjwwAPjwgsvjLfffju6d+8e2223XZSUlGzy73j48OGxcu
XK6NevXxQWFsZ2220Xr732Whx55JHRpEmTuPLKK2P16tVxxRVX5PaPBDLYgg0fPjyLiEo/hYWF2YgRI3L9Hn744SwisquvvrrC+J49e2Y1atTIZs6cmWVZlk2bNi2LiOyss86q0K9///5ZRGRjx47NtXXq1Cnr1KlT7nX79u2zhg0bbrDewYMHZxGR/e///u8m7jGwKcrnihdffHG9fVavXp2VlZVVaFu0aFG2ww47ZGeccUaubfbs2VlEZE2aNMkWL16cax8wYEAWEdm+++6bffbZZ7n2U045Jdtmm22ylStX5tpatmyZRUT24IMP5tqWLFmSNWvWLNtvv/1ybStXrszWrFlToabZs2dnhYWF2VVXXZVr+/3vf59FRHb//ffn2j755JOsTZs2WURk48aNy7V/+umnlfb92muvzWrUqJHNnTt3ve9PlmVZo0aNsn333XeDfc4///wsIrJXX301y7L/e+87dOiQrVq1KtfvhhtuyCIie+SRR7Isy7Jly5ZljRo1ys4+++wK63vvvfeyhg0bVmg/7bTTsoio8B5kWZbtt99+WYcOHSrVtM8++2S9e/fOvf7FL36RNW7cuMLvqXy9devW3eD+QXUyl43LtX+Vuez111/PiouLs4jI2rdvn11wwQXZww8/nH3yySeV+u61114Vvu/lu0/jxo3LIiLbY489KvxehgwZkkVE9tprr2VZlmWrVq3KmjZtmrVv375Cv6FDh2YRUaGGfH/HDRo0yBYuXFihf48ePbKioqIK79W//vWvrKCgIBMR03CqOVuFP/zhDzFmzJgYM2ZM3HXXXdG5c+c466yzcqc6PfHEE1FQUBDnn39+hXGXXHJJZFmWuwP6E088ERGfX6v45X4RscE7ky9btizq16+/wTrLly9btiyPvQO+CQUFBblr3tauXRsff/xxrF69Ovbff/94+eWXK/Xv1atXNGzYMPe6/BKTU089tcJlLqWlpbFq1aqYP39+hfHNmzeP448/Pve6/PTrqVOnxnvvvRcREYWFhVGz5uf/qV6zZk189NFHUa9evdh9990r1PTEE09Es2bNomfPnrm2OnXqRL9+/SrV/cWj0J988kl8+OGHceCBB0aWZTF16tQNvkf5zHNLly6t0N6vX7+oXbt27vVPfvKTqFWrVm7eHTNmTCxevDhOOeWU+PDDD3M/BQUFUVpaWukUyoiIc889t8Lrgw8+ON5+++0Kba+++mq89tprccopp+Tayrfx9NNPb3BfYEtkLtv4XLbXXnvFtGnT4tRTT405c+bEkCFDokePHrHDDjvE//zP/2xwbLmq7lO5008/vcJ11QcffHBERG7Oeumll2LhwoVx7rnnVujXt2/fCr+fiPx/xz/84Q+jSZMmuddr1qyJp59+Onr06BEtWrTIte+xxx7RtWvXKu0/+RO82Sp07NgxDj/88Dj88MOjd+/e8fjjj8eee+4Z5513XqxatSrmzp0bzZs3r/SFcY899oiIiLlz5+b+t2bNmtGmTZsK/Xbcccdo1KhRrt+61K9ff6OBunz5F68fAjYfI0eOjHbt2kVRUVFsv/320aRJk3j88cdjyZIllfp+8ctKROS+GH3xdMAvtn/5euY2bdpUelbqbrvtFhGfX5cX8fkXqptuuil23XXXKCwsjMaNG0eTJk3i1VdfrVDT3Llz17m+3XffvVLd8+bNi759+8Z2222Xuza6U6dOERHr3M8vymee+/J8u+uuu1Z4Xa9evWjWrFluX2fMmBEREYcddlg0adKkws8zzzxT6YZHRUVFFb5IRkRsu+22ld7nu+66K+rWrRu77LJLzJw5M2bOnBlFRUWx8847x913373BfYEtlblsw3NZeY2jRo2KDz/8MF599dW45pprolatWtGvX7/461//utHxVd2ncl9+n7fddtuI+L/3s/w75pfnytq1a8cuu+xSaX35/I5btWpV4fUHH3wQK1asqLStiHW/13w9XOPNVqlmzZrRuXPnGDJkSO
7LXD6+POFXxZ577hnTpk2LefPmVZpcy7366qsREeucQIHqddddd0Xfvn2jR48e8bOf/SyaNm0aBQUFce2118asWbMq9S8oKFjnetbXnn3pRo5Vcc0118Tll18eZ5xxRvzmN7+J7bbbLmrWrBkXXnhhrF27Nu/1rVmzJo444oj4+OOP47LLLou2bdtG3bp1Y/78+dG3b9+NrnOPPfaIqVOnRllZWRQWFq6zz6uvvhq1a9de5xe6DSnf9qhRo2LHHXestPyLR94i1v8+f1GWZXHvvffGJ598EnvuuWel5QsXLozly5d7djdbFXPZxueyLyooKIh99tkn9tlnn/j+978fnTt3jrvvvjsOP/zwr3Wfvs73M9/f8RfPDqD6CN5stVavXh0REcuXL8/dROPLp0m++eabERG5u1q2bNky1q5dGzNmzMgdDY+IeP/992Px4sXrvPtluf/4j/+Ie+65J+6888741a9+VWn50qVL45FHHonvfve7gjdshkaPHh277LJLPPTQQxX+8S3VkwhmzpwZWZZV2Nb06dMjInI3ERs9enR07tw5hg0bVmHs4sWLo3HjxrnXLVu2jNdff73S+t56660K41577bWYPn16jBw5ssKNIMeMGVOlmo899th44YUX4oEHHsjdvPKL5syZExMnTozDDz+80he9GTNmROfOnXOvly9fHgsWLMg9E7z8hpNNmzbd6BfeqpowYUL8+9//jquuuqrCnB7x+VGmfv36xcMPP7zOfYEtlbls05U/mnbBggW5tvUdjKnqPlVV+XfMGTNmVLjh22effRazZ8+Offfdt8K2v8rvuEmTJlFcXLzOg1Nffq/5+jjVnK3SZ599Fs8880xss802sccee0S3bt1izZo1ccstt1Tod9NNN0WNGjXi6KOPjojIfQH8/e9/X6Hf4MGDIyLimGOOWe82f/jDH8Zee+0V1113Xbz00ksVlq1duzZ+8pOfxKJFi3J3xQQ2L+VHI7549GHy5MnxwgsvJNneu+++W+FJCUuXLo0777wz2rdvnzviW1BQUOloyAMPPFDpGstu3brFu+++G6NHj861ffrppzF06NAK/da1j1mWxZAhQ6pU8znnnBNNmzaNn/3sZ5WupV65cmWcfvrpkWVZ/PrXv640dujQofHZZ5/lXt92222xevXq3PzbtWvXaNCgQVxzzTUV+pX74IMPqlTjF5WfZv6zn/0sevbsWeHn7LPPjl133dXp5mx1zGUbN3HixHXOM+X3nPji6dZ169aNxYsXV+pb1X2qqv333z+aNGkSf/zjH2PVqlW59hEjRlTa/lf9HRcUFETXrl3j4Ycfjnnz5uXa33jjDfe+SMgRb7YKTz75ZO7o9cKFC+Oee+6JGTNmxM9//vNo0KBB/Md//Ed07tw5fvnLX8acOXNi3333jWeeeSYeeeSRuPDCC3NHWvbdd9847bTTYujQobF48eLo1KlTTJkyJUaOHBk9evSocLTmy2rXrh0PPvhgHHbYYXHQQQfF6aefHvvvv38sXrw47rnnnnj55ZfjF7/4RZxwwgnfyHsCVHbHHXfEU089Van9ggsuiGOPPTYeeuihOP744+OYY46J2bNnxx//+MfYc889Y/ny5V97LbvttluceeaZ8eKLL8YOO+wQd9xxR7z//vsxfPjwXJ9jjz02rrrqqjj99NPjwAMPjNdeey3uvvvuSmfNnH322XHLLbdEnz594h//+Ec0a9YsRo0aFXXq1KnQr23bttG6devo379/zJ8/Pxo0aBAPPvjgOp+nvS7bb799jB49Oo455pj47ne/G2eddVbsueee8d5778WIESNi5syZMWTIkHU+pmvVqlXRpUuXOPHEE+Ott96KW2+9NQ466KDo3r17RHx+Q6bbbrstfvzjH8d3v/vdOPnkk6NJkyYxb968ePzxx+MHP/hBpX883ZCysrJ48MEH44gjjoiioqJ19unevXsMGTIkFi5c6N4bbFHMZV9tLrv++uvjH//4R5
xwwgnRrl27iIh4+eWX484774ztttsuLrzwwlzfDh06xG233RZXX311tGnTJpo2bRqHHXZYlfepqmrXrh1XX311nHPOOXHYYYfFSSedFLNnz47hw4dXWufX8TseOHBgPPXUU3HwwQfHf/7nf8bq1avj5ptvjr322it3aSRfs2/yFurwdVvX48SKioqy9u3bZ7fddlu2du3aXN9ly5ZlF110Uda8efOsdu3a2a677poNGjSoQp8sy7LPPvssGzhwYNaqVausdu3aWUlJSTZgwIAKj8/IssqPEyv3wQcfZJdccknWpk2bbJtttsnVNWzYsCTvAbBx63v0YPnPO++8k61duza75pprspYtW2aFhYXZfvvtlz322GPZaaedlrVs2TK3rvLHswwaNKjCNsofGfPAAw+sc9tffPxPy5Yts2OOOSZ7+umns3bt2mWFhYVZ27ZtK41duXJldskll2TNmjXLiouLsx/84AfZCy+8sM75Z+7cuVn37t2zOnXqZI0bN84uuOCC7Kmnnqr0CJ5//etf2eGHH57Vq1cva9y4cXb22Wdnr7zyShYR2fDhw6v0fs6ePTs7++yzsxYtWmS1a9fOGjdunHXv3j2bOHHiet/7CRMmZP369cu23XbbrF69elnv3r2zjz76qFL/cePGZV27ds0aNmyYFRUVZa1bt8769u2bvfTSS7k+63vs1xVXXJF7DM6DDz640bl3/PjxWURkQ4YM2eB6YXNhLhuX6/dV5rLnnnsu+6//+q9s7733zho2bJjVrl07a9GiRda3b99s1qxZFfq+99572THHHJPVr1+/wmO9qrpP63s/y9//L9d66623Zq1atcoKCwuz/fffP3v22WcrrfOr/o7LTZgwIevQoUO2zTbbZLvsskv2xz/+scI8yterRpZtwhX9QJW99tprcfDBB0dJSUlMmjSp0iMhALZmI0aMiNNPPz1efPHF3PWTAPBt4xpvSGyfffaJRx55JGbMmBE9evSocN0OAACw9XONN3wDOnXqFCtXrqzuMgAAgGrgiDcAAAAk5BpvAAAASMgRbwAAAEhI8AYAAICEBG8AAABISPAGAACAhARvAAAASEjwBgAAgIQEbwAAAEhI8AYAAICEBG8AAABI6P8BngK1I4EiFMEAAAAASUVORK5CYII=\n"
          },
          "metadata": {}
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "As we can see, this hybrid pruning approach, which combines static weight data with activation-based data to decide which neurons to keep, produces a very high-quality model.\n",
        "\n",
        "If we compare it with the model pruned using only the static data from the `gate_proj` and `up_proj` layers [(6_3_pruning_structured_llama3.2-1b_OK.ipynb)](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/6-PRUNING/6_3_pruning_structured_llama3.2-1b_OK.ipynb), the quality improvement on the Lambada benchmarks is remarkable.\n",
        "\n",
        "| Model | BoolQ | Lambada OpenAI | Lambada Standard |\n",
        "|---|---|---|---|\n",
        "| Base | 0.637 | 0.619 | 0.532 |\n",
        "| Pruning Hybrid | 0.622 | 0.445 | 0.370 |\n",
        "| Pruning Static | 0.622 | 0.293 | 0.241 |\n",
        "\n",
        "The method serves not only to create specialist models but also to build generalist models, using a general-purpose dataset such as WikiText."
      ],
      "metadata": {
        "id": "90wNDvvuauVL"
      }
    },
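    {
      "cell_type": "code",
      "source": [
        "# Sanity check for the comparison table above: for each benchmark, compute\n",
        "# the fraction of the base model's accuracy that each pruned model retains.\n",
        "# A minimal illustrative sketch; the scores are copied from the table.\n",
        "base = {'boolq': 0.637, 'lambada_openai': 0.619, 'lambada_standard': 0.532}\n",
        "hybrid = {'boolq': 0.622, 'lambada_openai': 0.445, 'lambada_standard': 0.370}\n",
        "static = {'boolq': 0.622, 'lambada_openai': 0.293, 'lambada_standard': 0.241}\n",
        "\n",
        "def retention(pruned, reference):\n",
        "    return {task: round(pruned[task] / reference[task], 3) for task in reference}\n",
        "\n",
        "print('Hybrid retention:', retention(hybrid, base))\n",
        "print('Static retention:', retention(static, base))"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },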
    {
      "cell_type": "markdown",
      "source": [
        "## Author's Note\n",
        "In addition to creating content like this notebook and offering it under the MIT license, I have also contributed to repositories such as those of Hugging Face and Google Gemini.\n",
        "\n",
        "I am especially proud of my book: <a href=\"https://amzn.to/4eanT1g\"><b>Large Language Models:</b> Apply and Implement Strategies for Large Language Models</a> (Apress).\n",
        "\n",
        "You can find it on both <a href=\"https://amzn.to/4eanT1g\">Amazon</a> and <a href=\"https://link.springer.com/book/10.1007/979-8-8688-0515-8\">Springer</a>, where they often have good deals on the purchase price.\n",
        "\n",
        "If you take a look and end up purchasing it, keep in mind that you can reach out with any questions via the Discussions section of this same repository or on any of my social media channels. I’ll do my best to respond as quickly as possible."
      ],
      "metadata": {
        "id": "Sd3Jpqaic39j"
      }
    }
  ],
  "metadata": {
    "colab": {
      "gpuType": "T4",
      "provenance": [],
      "include_colab_link": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}